How to use Classification Metrics in Python?

This recipe helps you use Classification Metrics in Python

Recipe Objective

After fitting a classification model to a dataset, we need a way to evaluate it. Many metrics are available; here we use accuracy, logarithmic loss and the area under the ROC curve (AUC).

This recipe shows how to compute these classification metrics in Python.
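Before the recipe itself, a small worked example may help make the three metrics concrete. This is a minimal sketch on made-up labels and probabilities (the arrays y_true and p_pos are illustrative assumptions, not part of the recipe's dataset). It computes log loss by hand and checks it against sklearn.metrics.

```python
import numpy as np
from sklearn import metrics

# Toy data, invented for illustration: true labels and the predicted
# probability of the positive class for five samples.
y_true = np.array([1, 0, 1, 1, 0])
p_pos = np.array([0.9, 0.2, 0.6, 0.4, 0.3])

# Hard predictions obtained by thresholding the probabilities at 0.5.
y_pred = (p_pos >= 0.5).astype(int)

# Accuracy: fraction of hard predictions that match the true labels.
accuracy = metrics.accuracy_score(y_true, y_pred)

# Log loss: mean negative log-likelihood of the true labels.
manual_log_loss = -np.mean(
    y_true * np.log(p_pos) + (1 - y_true) * np.log(1 - p_pos)
)
sk_log_loss = metrics.log_loss(y_true, p_pos)

# ROC AUC: how well the probabilities rank positives above negatives.
auc = metrics.roc_auc_score(y_true, p_pos)

print("Accuracy:", accuracy)         # 4 of 5 correct
print("Log loss:", sk_log_loss)      # about 0.42
print("AUC:", auc)                   # every positive outranks every negative
```

Here accuracy is 0.8 because one positive sample falls below the 0.5 threshold, while AUC is 1.0 because every positive still receives a higher probability than every negative. The metrics answer different questions, which is why the recipe reports more than one.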

Step 1 - Import the library

    from sklearn import datasets
    from sklearn import tree, model_selection, metrics
    from sklearn.model_selection import train_test_split

We have imported datasets, tree, model_selection, metrics and train_test_split, which are needed to load the data, build the model and evaluate it.

Step 2 - Setting up the Data

We load the built-in breast cancer dataset and store the features in X and the target in y. We split the data into training and test sets with train_test_split, then create a 10-fold splitter with model_selection.KFold. The splitter shuffles the data (shuffle=True) so that random_state takes effect; recent scikit-learn versions raise an error when random_state is passed without shuffling.

    seed = 42
    dataset = datasets.load_breast_cancer()
    X = dataset.data; y = dataset.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
    kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
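To see what a KFold splitter produces, here is a minimal sketch on a tiny hypothetical array (X_demo and the choice of 5 splits are assumptions made for readability, not the recipe's settings). Each iteration yields index arrays for the training and validation parts, and every sample lands in exactly one validation fold.

```python
import numpy as np
from sklearn.model_selection import KFold

# Ten hypothetical samples with two features each.
X_demo = np.arange(20).reshape(10, 2)

# shuffle=True is needed for random_state to have any effect.
kfold_demo = KFold(n_splits=5, shuffle=True, random_state=42)

fold_sizes = []
all_val_idx = []
for train_idx, val_idx in kfold_demo.split(X_demo):
    # Record (training size, validation size) for this fold.
    fold_sizes.append((len(train_idx), len(val_idx)))
    all_val_idx.extend(val_idx.tolist())

print(fold_sizes)           # five folds of 8 training and 2 validation samples
print(sorted(all_val_idx))  # each index 0..9 appears exactly once
```

With 10 splits on the real training set, the same pattern holds: the model is fitted ten times, each time validated on a different tenth of the data.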

Step 3 - Training model and calculating Metrics

Here we use DecisionTreeClassifier as the model.

    model = tree.DecisionTreeClassifier()

Now we calculate the different metrics. We use cross_val_score to compute each metric on every fold, then print the mean and standard deviation of the scores.

  • Calculating Accuracy

    scoring = "accuracy"
    results = model_selection.cross_val_score(model, X_train, y_train, cv=kfold, scoring=scoring)
    print("Accuracy: ", results.mean()); print("Standard Deviation: ", results.std())

  • Calculating Logarithmic Loss (the "neg_log_loss" scorer returns the negated loss, so values closer to zero are better)

    scoring = "neg_log_loss"
    results = model_selection.cross_val_score(model, X_train, y_train, cv=kfold, scoring=scoring)
    print("Logloss: ", results.mean()); print("Standard Deviation: ", results.std())

  • Calculating Area under ROC curve

    scoring = "roc_auc"
    results = model_selection.cross_val_score(model, X_train, y_train, cv=kfold, scoring=scoring)
    print(); print("AUC: ", results.mean()); print("Standard Deviation: ", results.std())

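One run produced the following output; exact values vary between runs because neither the train/test split nor the decision tree is seeded: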

Accuracy:  0.9248615725359912
Standard Deviation:  0.03454639234547574

Logloss:  -2.675538335423929
Standard Deviation:  1.2623224750420183

AUC:  0.9168731849436718
Standard Deviation:  0.027925303925433888
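
The scores above are cross-validated on the training split only. As a complementary check, the same three metrics can be computed on the held-out test split with the functions in sklearn.metrics. The following is a minimal, self-contained sketch; fixing random_state=42 for the split and the tree is an assumption added here for repeatability and is not part of the recipe above.

```python
from sklearn import datasets, metrics, tree
from sklearn.model_selection import train_test_split

dataset = datasets.load_breast_cancer()
X, y = dataset.data, dataset.target

# Seeding the split and the model is an assumption made for repeatability.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)
model = tree.DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

# Hard class predictions for accuracy; positive-class probabilities
# for log loss and ROC AUC.
y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)[:, 1]

test_accuracy = metrics.accuracy_score(y_test, y_pred)
test_log_loss = metrics.log_loss(y_test, y_proba)  # positive; lower is better
test_auc = metrics.roc_auc_score(y_test, y_proba)

print("Test accuracy:", test_accuracy)
print("Test log loss:", test_log_loss)
print("Test AUC:", test_auc)
```

Unlike the "neg_log_loss" scorer, metrics.log_loss returns the loss with its natural positive sign. A fully grown decision tree typically predicts probabilities of exactly 0 or 1, so each misclassified sample is penalised heavily, which is a likely reason the log loss looks large next to the accuracy and AUC figures.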