# How to compare different classification models using logloss and how to pick the best one?

This recipe shows how to compare different classification models using log loss and how to pick the best one.

Log loss is useful for comparing models because it scores the probabilities a model outputs, not just the class labels it predicts. Two models that predict the same labels can still have different log loss if one is more confident in its correct predictions.

* To calculate log loss, the classifier must assign a probability to each class.

* Log loss measures the uncertainty of the model's prediction for every sample, compares it with the true label, and penalises wrong classifications, most heavily when the model was confident.

* Log loss is defined for two or more labels.

* A log loss nearer to 0 means the predicted probabilities agree more closely with the true labels; a value further from zero means they agree less. Log loss ranges from 0 to infinity.
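To make the penalty concrete, here is a minimal sketch of the loss for a single sample as a function of the probability the model gave to the true class. The probability values are made up for illustration.

```python
import numpy as np

# Probability the model assigned to the TRUE class of a single sample
# (illustrative values, not taken from any dataset).
probs_for_true_class = np.array([0.99, 0.9, 0.5, 0.1, 0.01])

# For one sample, log loss reduces to -log(p), where p is the
# probability given to the true class.
per_sample_loss = -np.log(probs_for_true_class)

for p, loss in zip(probs_for_true_class, per_sample_loss):
    print(f"p(true class) = {p:.2f} -> log loss = {loss:.3f}")
```

The loss grows slowly while the model is mostly right and very quickly once it puts little probability on the true class, which is why confident mistakes dominate the score.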

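If there are N samples belonging to M classes:

1.) $y_{ij}$ indicates whether sample i belongs to class j (1 if it does, 0 otherwise)

2.) $p_{ij}$ indicates the predicted probability of sample i belonging to class j

then log loss is defined as

$$\text{LogLoss} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{M} y_{ij}\,\log(p_{ij})$$

The negative sign negates the output of $\log(p_{ij})$, which is never positive: $p_{ij}$ is a probability between 0 and 1, and log(x) is negative if 0 < x < 1.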
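The definition in terms of yij and pij can be checked numerically. The sketch below computes the average of the negated log-probabilities assigned to the true classes by hand and compares it with `sklearn.metrics.log_loss`. The labels and probabilities are hypothetical values chosen only for illustration.

```python
import numpy as np
from sklearn.metrics import log_loss

# Hypothetical example: 4 samples, 3 classes (values made up for illustration).
y_true = np.array([0, 2, 1, 2])
p = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
    [0.3, 0.3, 0.4],
])

# One-hot encode the labels: y[i, j] = 1 if sample i belongs to class j, else 0.
y = np.eye(p.shape[1])[y_true]

# Sum y_ij * log(p_ij) over classes, average over samples, and negate.
manual = -np.mean(np.sum(y * np.log(p), axis=1))

# Reference value from scikit-learn for the same inputs.
reference = log_loss(y_true, p, labels=[0, 1, 2])

print(manual, reference)
```

Because each row of `y` has a single 1, only the probability assigned to the true class contributes to the sum for that sample.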

```
from sklearn.model_selection import train_test_split, cross_val_score, cross_val_predict
import pandas as pd
import numpy as np
import seaborn as sns
from sklearn.linear_model import LogisticRegression
```

We will load the iris dataset directly through the seaborn library and split it into training and test sets.

```
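# Load the iris dataset and separate the features from the target
iris = sns.load_dataset('iris')
X = iris.drop(columns='species')
y = iris['species']

# Hold out 30% of the data for testing
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=0.3, random_state=20)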
```

We will now fit the machine learning model.

```
# Logistic Regression
clf_logreg = LogisticRegression()
# fit model
clf_logreg.fit(Xtrain, ytrain)
```

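We will calculate the log loss score with cross-validation. Note that scikit-learn's `neg_log_loss` scorer returns the negative of the log loss so that higher is always better; the printed value is therefore negative, and when comparing models the one whose score is closest to 0 is the best.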

```
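# cross_val_score clones and refits the model on each fold (5-fold by default).
# 'neg_log_loss' is the negated log loss, so values closer to 0 are better.
logloss_logreg = cross_val_score(clf_logreg, Xtrain, ytrain, scoring='neg_log_loss').mean()
print(logloss_logreg)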
```
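The same scoring call can be repeated for several classifiers to pick the best one. The following is a minimal sketch under stated assumptions, not a definitive procedure: the choice of candidate models (`GaussianNB`, `KNeighborsClassifier`), the `max_iter=1000` setting, and the use of scikit-learn's bundled `load_iris` in place of `sns.load_dataset('iris')` (to keep the example self-contained) are all illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Assumption: scikit-learn's bundled iris copy stands in for sns.load_dataset('iris').
X, y = load_iris(return_X_y=True)
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=0.3, random_state=20)

# Candidate models (an illustrative choice); each must implement predict_proba.
models = {
    "logreg": LogisticRegression(max_iter=1000),
    "naive_bayes": GaussianNB(),
    "knn": KNeighborsClassifier(n_neighbors=5),
}

# 'neg_log_loss' returns the negated log loss, so flip the sign back
# to get an ordinary log loss where lower is better.
cv_logloss = {
    name: -cross_val_score(model, Xtrain, ytrain, cv=5, scoring="neg_log_loss").mean()
    for name, model in models.items()
}

# Print the models from best (lowest log loss) to worst.
for name, score in sorted(cv_logloss.items(), key=lambda kv: kv[1]):
    print(f"{name}: mean CV log loss = {score:.4f}")

# The best model is the one with the LOWEST mean cross-validated log loss.
best_name = min(cv_logloss, key=cv_logloss.get)
best_model = models[best_name].fit(Xtrain, ytrain)

# Check the chosen model once on the held-out test set.
test_logloss = log_loss(ytest, best_model.predict_proba(Xtest), labels=best_model.classes_)
print(f"best: {best_name}, test log loss = {test_logloss:.4f}")
```

Selecting on the cross-validated score and only then touching the test set keeps the final test log loss an honest estimate for the chosen model.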
