How to compare different classification models using logloss and how to pick the best one?

How to compare different classification models using logloss and how to pick the best one?

How to compare different classification models using logloss and how to pick the best one?

This recipe helps you compare different classification models using logloss and how to pick the best one


Recipe Objective

How to compare different classification models using logloss and how to pick the best one

LOG loss is useful when we have to compare models, It compares the model mainly in two ways by their outputs and their probabilistic outcome.

* To calculate LOG loss the classifier assigns the probability to each class.

* LOG loss starts to measures the uncertainity of the model of every sample and it compares with the true labels and in return penalises the false classification.

* LOG loss has the ability to get defined for two or more labels

* LOG loss nearer to 0 means higher accuracy away from zero means lower accuracy. LOG loss has the range between 0 to infinity.

If there are N samples belonging to M classes :

1.) yij , indicates whether sample i belongs to class j or not

2.) pij , indicates the probability of sample i belonging to class j

The negative sign negates log(yij^) output which is always negative. yij^ outputs a probability (0 - 1). log(x) is nagative if 0 < x < 1.

Step 1- Importing Libraries

from sklearn.model_selection import train_test_split, cross_val_score, cross_val_predict import pandas as pd import numpy as np import seaborn as sns from sklearn.linear_model import LogisticRegression

Step 2- Importing and preparing the dataset.

We will import the dataset directly through seaborn library.

iris = sns.load_dataset('iris') X=iris.drop(columns='species') y=iris['species'] Xtrain, Xtest, ytrain, ytest= train_test_split(X,y, test_size=0.3, random_state=20)

Step 3- Fitting the Model.

We will start the fit the Machine Learning Model.

# Logistic Regression clf_logreg = LogisticRegression() # fit model, ytrain)

Step 4- Calculating the LOG LOSS.

we will calculate the LOG LOSS score.

logloss_logreg = cross_val_score(clf_logreg, Xtrain, ytrain, scoring = 'neg_log_loss').mean() print(logloss_logreg)

Relevant Projects

Zillow’s Home Value Prediction (Zestimate)
Data Science Project in R -Build a machine learning algorithm to predict the future sale prices of homes.

Credit Card Fraud Detection as a Classification Problem
In this data science project, we will predict the credit card fraud in the transactional dataset using some of the predictive models.

Choosing the right Time Series Forecasting Methods
There are different time series forecasting methods to forecast stock price, demand etc. In this machine learning project, you will learn to determine which forecasting method to be used when and how to apply with time series forecasting example.

Build an Image Classifier for Plant Species Identification
In this machine learning project, we will use binary leaf images and extracted features, including shape, margin, and texture to accurately identify plant species using different benchmark classification techniques.

Resume parsing with Machine learning - NLP with Python OCR and Spacy
In this machine learning resume parser example we use the popular Spacy NLP python library for OCR and text classification.

Learn to prepare data for your next machine learning project
Text data requires special preparation before you can start using it for any machine learning project.In this ML project, you will learn about applying Machine Learning models to create classifiers and learn how to make sense of textual data.

Predict Churn for a Telecom company using Logistic Regression
Machine Learning Project in R- Predict the customer churn of telecom sector and find out the key drivers that lead to churn. Learn how the logistic regression model using R can be used to identify the customer churn in telecom dataset.

Music Recommendation System Project using Python and R
Machine Learning Project - Work with KKBOX's Music Recommendation System dataset to build the best music recommendation engine.

German Credit Dataset Analysis to Classify Loan Applications
In this data science project, you will work with German credit dataset using classification techniques like Decision Tree, Neural Networks etc to classify loan applications using R.

Ensemble Machine Learning Project - All State Insurance Claims Severity Prediction
In this ensemble machine learning project, we will predict what kind of claims an insurance company will get. This is implemented in python using ensemble machine learning algorithms.