How to use Classification Metrics in Python?
MACHINE LEARNING RECIPES DATA CLEANING PYTHON DATA MUNGING PANDAS CHEATSHEET     ALL TAGS

How to use Classification Metrics in Python?

How to use Classification Metrics in Python?

This recipe helps you use Classification Metrics in Python

Recipe Objective

In a dataset after applying a Classification model how to evaluate it. There are many metrics that we can use. We will be using accuracy , logarithmic loss and Area under ROC.

So this is the recipe on how we we can use Classification Metrics in Python.

Step 1 - Import the library

from sklearn import datasets from sklearn import tree, model_selection, metrics from sklearn.model_selection import train_test_split

We have imported datasets, tree, model_selection and test_train_split which will be needed for the dataset.

Step 2 - Setting up the Data

We have imported inbuilt wine dataset and stored data in x and target in y. We have used to split the data by test train split. Then we have used model_selection.KFold. seed = 42 dataset = datasets.load_breast_cancer() X = dataset.data; y = dataset.target X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25) kfold = model_selection.KFold(n_splits=10, random_state=seed) kfold = model_selection.KFold(n_splits=10, random_state=seed)

Step 3 - Training model and calculating Metrics

Here we will be using DecisionTreeClassifier as a model model = tree.DecisionTreeClassifier() Now we will be calculating different metrics. We will be using cross validation score to calculate the metrices. So we will be printing the mean and standard deviation of all the scores.

  • Calculating Accuracy
  • scoring = "accuracy" results = model_selection.cross_val_score(model, X_train, y_train, cv=kfold, scoring=scoring) print("Accuracy: ", results.mean()); print("Standard Deviation: ", results.std())
  • Calculating Logarithmic Loss
  • scoring = "neg_log_loss" results = model_selection.cross_val_score(model, X_train, y_train, cv=kfold, scoring=scoring) print("Logloss: ", results.mean()); print("Standard Deviation: ", results.std())
  • Calculating Area under ROC curve
  • scoring = "roc_auc" results = model_selection.cross_val_score(model, X_train, y_train, cv=kfold, scoring=scoring) print(); print("AUC: ", results.mean()); print("Standard Deviation: ", results.std())
So the output comes as:

Accuracy:  0.9248615725359912
Standard Deviation:  0.03454639234547574

Logloss:  -2.675538335423929
Standard Deviation:  1.2623224750420183

AUC:  0.9168731849436718
Standard Deviation:  0.027925303925433888
‚Äč

Download Materials

Relevant Projects

Word2Vec and FastText Word Embedding with Gensim in Python
In this NLP Project, you will learn how to use the popular topic modelling library Gensim for implementing two state-of-the-art word embedding methods Word2Vec and FastText models.

Multi-Class Text Classification with Deep Learning using BERT
In this project, we will cover in detail the architecture of a transformer used in natural language processing use cases. We will go through the key nlp areas in the pre-transformer stage like bow, word2vec...and then the origin and gradual refinement of transformers. Finally, we will study one of the most popular state of the art transformer models, called BERT and use it for text classification on a large dataset.

Data Science Project in Python on BigMart Sales Prediction
The goal of this data science project is to build a predictive model and find out the sales of each product at a given Big Mart store.

Census Income Data Set Project - Predict Adult Census Income
Use the Adult Income dataset to predict whether income exceeds 50K yr based on census data.

Image Segmentation using Mask R-CNN with Tensorflow
In this Deep Learning Project on Image Segmentation Python, you will learn how to implement the Mask R-CNN model for early fire detection.

Credit Card Fraud Detection as a Classification Problem
In this data science project, we will predict the credit card fraud in the transactional dataset using some of the predictive models.

Digit Recognition using CNN for MNIST Dataset in Python
In this deep learning project, you will build a convolutional neural network using MNIST dataset for handwritten digit recognition.

Predict Employee Computer Access Needs in Python
Data Science Project in Python- Given his or her job role, predict employee access needs using amazon employee database.

Classification of T shirt images to see if they have text on them
Want to search images of clothes which have text on them? Then this project talks through how we can classify an image whether it has text on it or not. For this we use state of the model called as inception and try and deepdive into how it works on our dataset

Build OCR from Scratch Python using YOLO and Tesseract
In this deep learning project, you will learn how to build your custom OCR (optical character recognition) from scratch by using Google Tesseract and YOLO to read the text from any images.