How to generate classification report and confusion matrix in Python?

How to generate classification report and confusion matrix in Python?

How to generate classification report and confusion matrix in Python?

This recipe helps you generate classification report and confusion matrix in Python


Recipe Objective

While using a classification problem we need to use various metrics like precision, recall, f1-score, support or others to check how efficient our model is working.

For this we need to compute there scores by classification report and confusion matrix. So in this recipie we will learn how to generate classification report and confusion matrix in Python.

This data science python source code does the following:
1. Imports necessary libraries and dataset from sklearn
2. performs train test split on the dataset
3. Applies DecisionTreeClassifier model for prediction
4. Prepares classification report for the output

Step 1 - Import the library

from sklearn import datasets from sklearn.tree import DecisionTreeClassifier from sklearn.model_selection import train_test_split from sklearn.metrics import classification_report, confusion_matrix

We have imported datasets to use the inbuilt dataframe , DecisionTreeClassifier, train_test_split, classification_report and confusion_matrix.

Step 2 - Setting up the Data

Here we have used datasets to load the inbuilt wine dataset and we have created objects X and y to store the data and the target value respectively. wine = datasets.load_wine() X = y = We are creating a list of target names and We are using train_test_split is used to split the data into two parts, one is train which is used to train the model and the other is test which is used to check how our model is working on unseen data. Here we are passing 0.3 as a parameter in the train_test_split which will split the data such that 30% of data will be in test part and rest 70% will be in the train part.

class_names = wine.target_names X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)

Step 3 - Training the model

Here we are using DecisionTreeClassifier to predict as a classification model and training it on the train data. After that predicting the output of test data. classifier_tree = DecisionTreeClassifier() y_predict =, y_train).predict(X_test)

Step 5 - Creating Classification Report and Confusion Matrix

Let us first have a look on the parameters of Classification Report:

  • y_true : In this parameter we have to pass the true target values of the data.
  • y_pred : It this parameter we have to pass the predicted output of model.
  • target_names : In this parameter we have to pass the names of target.
For Confusion Matrix there are two parameters test and predicted values of the data. print(classification_report(y_test, y_predict, target_names=class_names)) print(confusion_matrix(y_test, y_predict)) So the output comes as

              precision    recall  f1-score   support

     class_0       0.95      0.95      0.95        19
     class_1       0.95      0.95      0.95        21
     class_2       0.95      0.95      0.95        19

   micro avg       0.95      0.95      0.95        59
   macro avg       0.95      0.95      0.95        59
weighted avg       0.95      0.95      0.95        59

[[18  1  0]
 [ 0 20  1]
 [ 1  0 18]]

Relevant Projects

Walmart Sales Forecasting Data Science Project
Data Science Project in R-Predict the sales for each department using historical markdown data from the Walmart dataset containing data of 45 Walmart stores.

Resume parsing with Machine learning - NLP with Python OCR and Spacy
In this machine learning resume parser example we use the popular Spacy NLP python library for OCR and text classification.

Identifying Product Bundles from Sales Data Using R Language
In this data science project in R, we are going to talk about subjective segmentation which is a clustering technique to find out product bundles in sales data.

Predict Macro Economic Trends using Kaggle Financial Dataset
In this machine learning project, you will uncover the predictive value in an uncertain world by using various artificial intelligence, machine learning, advanced regression and feature transformation techniques.

Demand prediction of driver availability using multistep time series analysis
In this supervised learning machine learning project, you will predict the availability of a driver in a specific area by using multi step time series analysis.

Build a Collaborative Filtering Recommender System in Python
Use the Amazon Reviews/Ratings dataset of 2 Million records to build a recommender system using memory-based collaborative filtering in Python.

Natural language processing Chatbot application using NLTK for text classification
In this NLP AI application, we build the core conversational engine for a chatbot. We use the popular NLTK text classification library to achieve this.

Music Recommendation System Project using Python and R
Machine Learning Project - Work with KKBOX's Music Recommendation System dataset to build the best music recommendation engine.

Time Series Forecasting with LSTM Neural Network Python
Deep Learning Project- Learn to apply deep learning paradigm to forecast univariate time series data.

Machine Learning or Predictive Models in IoT - Energy Prediction Use Case
In this machine learning and IoT project, we are going to test out the experimental data using various predictive models and train the models and break the energy usage.