Generate classification report and confusion matrix in Python

In this recipe you will generate a classification report and a confusion matrix. You will also learn which libraries are required and how to perform a train-test split on a dataset in Python.
Last Updated: 19 Jan 2023


Recipe Objective

When working on a classification problem, we need metrics such as precision, recall, f1-score and support to check how well our model is performing.

These scores can be computed with a classification report and a confusion matrix. So in this recipe we will learn how to generate a classification report and a confusion matrix in Python.

This data science Python source code does the following:
1. Imports the necessary libraries and dataset from sklearn
2. Performs a train-test split on the dataset
3. Applies a DecisionTreeClassifier model for prediction
4. Prepares a classification report and confusion matrix for the output


Recipe Objective
Step 1 - Import the library
Step 2 - Setting up the Data
Step 3 - Training the model
Step 5 - Creating Classification Report and Confusion Matrix

Step 1 - Import the library

from sklearn import datasets
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix

We have imported datasets to load a built-in dataset, along with DecisionTreeClassifier, train_test_split, classification_report and confusion_matrix.

Step 2 - Setting up the Data

Here we have used datasets to load the built-in wine dataset, and we have created objects X and y to store the data and the target values respectively.

wine = datasets.load_wine()
X = wine.data
y = wine.target

We then store the target names and use train_test_split to split the data into two parts: a train part, used to train the model, and a test part, used to check how the model performs on unseen data. Here we pass test_size=0.30 to train_test_split, so 30% of the data goes to the test part and the remaining 70% to the train part.

class_names = wine.target_names
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)
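The split above is random, so every run produces different train and test sets. Below is a minimal sketch of a reproducible variant, assuming you want a fixed seed and the class proportions preserved in both parts. The random_state value of 42 and the use of stratify are illustrative choices, not part of the original recipe.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

# Load the built-in wine dataset (178 samples, 13 features).
wine = datasets.load_wine()
X, y = wine.data, wine.target

# random_state fixes the shuffle; stratify=y keeps the class ratio
# roughly equal in the train and test parts. Both are assumptions
# added for illustration.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```

Running this twice should give the same split both times, which makes the reported metrics easier to compare between experiments.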

Step 3 - Training the model

Here we are using DecisionTreeClassifier as the classification model and training it on the train data. After that we predict the output for the test data.

classifier_tree = DecisionTreeClassifier()
y_predict = classifier_tree.fit(X_train, y_train).predict(X_test)
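The chained fit and predict call can also be written as two separate steps, which keeps the fitted model available for later use. The following is a small sketch under the assumption that a fixed random_state (here 0, an arbitrary value) is acceptable for both the split and the tree. The score call is added only as a quick accuracy check.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

wine = datasets.load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.30, random_state=0
)

# Step 1: fit the tree on the training part.
classifier_tree = DecisionTreeClassifier(random_state=0)
classifier_tree.fit(X_train, y_train)

# Step 2: predict one class label per test sample.
y_predict = classifier_tree.predict(X_test)

print(y_predict.shape)                        # one prediction per test row
print(classifier_tree.score(X_test, y_test))  # mean accuracy on the test part
```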


Step 5 - Creating Classification Report and Confusion Matrix

Let us first have a look at the parameters of classification_report:

y_true : the true target values of the data.
y_pred : the predicted output of the model.
target_names : the display names of the target classes, in label order.
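To see how these three parameters fit together, here is a short sketch on a tiny hand-made example. The labels and the class names are made up for illustration. It also uses the output_dict=True option of classification_report, which returns the scores as a nested dictionary instead of a formatted string, so individual values can be read programmatically.

```python
from sklearn.metrics import classification_report

# Hypothetical toy labels: three classes, two samples each.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
names = ["class_0", "class_1", "class_2"]  # illustrative display names

report = classification_report(
    y_true, y_pred, target_names=names, output_dict=True
)

# Both class_1 samples were predicted as class_1, so its recall is 1.0.
print(report["class_1"]["recall"])
# Two samples were predicted as class_0 and one of them is correct.
print(report["class_0"]["precision"])
# Support is the number of true samples of that class.
print(report["class_2"]["support"])
```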

confusion_matrix takes two parameters: the true target values and the predicted values of the test data.

print(classification_report(y_test, y_predict, target_names=class_names))
print(confusion_matrix(y_test, y_predict))

The output looks like this (the exact numbers vary between runs because the train-test split is random):

              precision    recall  f1-score   support

     class_0       0.95      0.95      0.95        19
     class_1       0.95      0.95      0.95        21
     class_2       0.95      0.95      0.95        19

   micro avg       0.95      0.95      0.95        59
   macro avg       0.95      0.95      0.95        59
weighted avg       0.95      0.95      0.95        59

[[18  1  0]
 [ 0 20  1]
 [ 1  0 18]]
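The two outputs are closely related: the per-class scores in the report can be recomputed from the confusion matrix, whose rows are the true classes and whose columns are the predicted classes. The sketch below copies the matrix printed above into a NumPy array and derives precision, recall, f1-score and support from it. The variable names are illustrative.

```python
import numpy as np

# Confusion matrix from the output above (rows: true, columns: predicted).
cm = np.array([[18, 1, 0],
               [0, 20, 1],
               [1, 0, 18]])

true_positives = np.diag(cm)   # correctly classified samples per class
support = cm.sum(axis=1)       # true samples per class (row sums)
predicted = cm.sum(axis=0)     # predicted samples per class (column sums)

recall = true_positives / support
precision = true_positives / predicted
f1 = 2 * precision * recall / (precision + recall)

print(np.round(precision, 2))
print(np.round(recall, 2))
print(np.round(f1, 2))
print(support)
```

Rounded to two decimals, each score is 0.95 and the supports are 19, 21 and 19, which agrees with the report.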
