How to read a confusion matrix

In this recipe, we will learn how to interpret a confusion matrix and the terminology associated with it.


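A confusion matrix is a table that shows how well a classification model performs on the test data by comparing its predicted classes with the actual classes. It is fairly simple to understand, but let us get acquainted with some terminology first. Consider the confusion matrix given below:

n = 150         Predicted NO    Predicted YES    Total
Actual NO       TN = 35         FP = 5           40
Actual YES      FN = 10         TP = 100         110
Total           45              105              150
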
This is a confusion matrix for binary classification. Let us assume that YES stands for a person testing positive for a disease and NO stands for a person testing negative. There are therefore two possible predicted classes: YES and NO.

In total, 150 people were tested for the disease. The classifier predicted that 105 of them tested positive and that the remaining 45 tested negative. In actuality, 110 people tested positive and 40 tested negative.
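
The totals quoted above are the row and column sums of the matrix. As a quick check, here is a minimal Python sketch. It assumes the four cell counts used later in this recipe (TN = 35, FP = 5, FN = 10, TP = 100) and a layout with actual classes as rows and predicted classes as columns; the variable names are illustrative only.

```python
# Rows are actual classes, columns are predicted classes (assumed layout).
# The four cell counts are the ones this recipe reads from its table.
matrix = {
    "actual NO": {"predicted NO": 35, "predicted YES": 5},
    "actual YES": {"predicted NO": 10, "predicted YES": 100},
}

# Row sums give the number of people in each actual class.
actual_totals = {label: sum(row.values()) for label, row in matrix.items()}

# Column sums give the number of people in each predicted class.
predicted_totals = {}
for row in matrix.values():
    for label, count in row.items():
        predicted_totals[label] = predicted_totals.get(label, 0) + count

total = sum(actual_totals.values())

print(actual_totals)
print(predicted_totals)
print(total)
```

The row sums should match the 110 actual positives and 40 actual negatives, and the column sums should match the 105 predicted positives and 45 predicted negatives.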

Terminologies –

True Positive (TP) -> Observations that were predicted YES and were actually YES.
True Negative (TN) -> Observations that were predicted NO and were actually NO.
False Positive (FP) -> Observations that were predicted YES but were actually NO.
False Negative (FN) -> Observations that were predicted NO but were actually YES.
Accuracy -> The fraction of all observations that the classifier predicted correctly.
Error Rate -> The fraction of all observations that the classifier predicted incorrectly. It is equal to 1 - Accuracy.
True Positive Rate (TPR) or Recall -> How often the classifier predicts YES when the observation is actually YES.
False Positive Rate (FPR) -> How often the classifier predicts YES when the observation is actually NO.
True Negative Rate (TNR) -> How often the classifier predicts NO when the observation is actually NO.
Precision -> TP/(TP+FP), i.e. true positives divided by predicted YES.
Recall -> TP/(TP+FN), i.e. true positives divided by actual YES. This is the same quantity as TPR.
F beta score -> ((1 + beta^2) * Precision * Recall) / (beta^2 * Precision + Recall). Common values of beta are 0.5, 1 and 2.
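
The F beta formula is perhaps easier to read as code. The sketch below is a plain-Python illustration; the function name f_beta and the sample precision and recall values are hypothetical and not taken from any library or from the table. It shows that beta = 1 gives the harmonic mean of precision and recall, while beta below 1 leans towards precision and beta above 1 leans towards recall.

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall."""
    beta_sq = beta ** 2
    denominator = beta_sq * precision + recall
    if denominator == 0:
        # Precision and recall are both 0; treat the score as 0 here.
        return 0.0
    return (1 + beta_sq) * precision * recall / denominator


# Illustrative values only: high precision, lower recall.
p, r = 0.9, 0.5

f_half = f_beta(p, r, beta=0.5)  # weights precision more heavily
f_one = f_beta(p, r, beta=1.0)   # plain harmonic mean (F1)
f_two = f_beta(p, r, beta=2.0)   # weights recall more heavily

print(f_half, f_one, f_two)
```

Because precision is higher than recall in this made-up example, the score falls as beta grows.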


Let us calculate the above-mentioned measures for our confusion matrix.
From the table,

TP = 100
FP = 5
TN = 35
FN = 10
Accuracy = (TP + TN) / (TP+TN+FP+FN) = (100+35)/150 = 0.9
Error Rate = (FP + FN) / (TP+TN+FP+FN) = (5+10)/150 = 0.1
TPR = TP/actual YES = 100/110 = 0.909
FPR = FP/actual NO = 5/40 = 0.125
TNR = TN/actual NO = 35/40 = 0.875
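Precision = TP/predicted YES = 100/105 = 0.952
Recall = TP/actual YES = 100/110 = 0.909 (same as TPR)
F1 score (beta = 1) = (2 * Precision * Recall) / (Precision + Recall) = 0.930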
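
The calculations above can be reproduced with a few lines of Python. This is a minimal sketch that hard-codes the four counts from the table; the dictionary keys are illustrative names, and the F1 score (F beta with beta = 1) is included alongside the other measures.

```python
# Counts read from the confusion matrix in this recipe.
TP, FP, TN, FN = 100, 5, 35, 10

total = TP + TN + FP + FN
actual_yes = TP + FN
actual_no = TN + FP
predicted_yes = TP + FP

metrics = {
    "accuracy": (TP + TN) / total,
    "error_rate": (FP + FN) / total,
    "tpr_recall": TP / actual_yes,
    "fpr": FP / actual_no,
    "tnr": TN / actual_no,
    "precision": TP / predicted_yes,
}

# F1 is the F beta score with beta = 1.
metrics["f1"] = (
    2 * metrics["precision"] * metrics["tpr_recall"]
    / (metrics["precision"] + metrics["tpr_recall"])
)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that FPR and TNR add up to 1, since both are computed over the same 40 actual NO cases.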
This is how you can derive conclusions from a confusion matrix.

To learn how to build a confusion matrix you can refer to the following tutorials –

How to build a confusion matrix in R
How to get Classification Confusion Matrix?
Classification report and Confusion matrix in python
