How to read a confusion matrix

In this recipe, we will learn how to interpret a confusion matrix and the terminology associated with it.

How to read a confusion matrix?

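A confusion matrix is a table that shows how well a classification model performs on the test data by comparing its predictions with the actual classes. It is fairly simple to understand, but let us get acquainted with some terminology first. Consider the confusion matrix given below:

n = 150        Predicted NO    Predicted YES    Total
Actual NO      35              5                40
Actual YES     10              100              110
Total          45              105              150
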
It is a confusion matrix for binary classification. Let us assume that YES stands for a person testing positive for a disease and NO stands for a person not testing positive for the disease. There are therefore two predicted classes: YES and NO.

150 people were tested for the disease. The classifier predicted that 105 people would test positive and the remaining 45 negative. In reality, 110 people tested positive and 40 tested negative.
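The totals above come from tallying every (actual, predicted) pair. The following is a minimal sketch of that tally in plain Python. The label lists are hypothetical: they are constructed only to reproduce the cell counts used later in this recipe (100, 10, 5 and 35), and the variable names are illustrative.

```python
from collections import Counter

# Hypothetical labels, built only to match the counts in this recipe:
# 110 people are actually YES and 40 are actually NO.
actual = ["YES"] * 110 + ["NO"] * 40
# Of the 110 actual YES, 100 are predicted YES and 10 are predicted NO;
# of the 40 actual NO, 5 are predicted YES and 35 are predicted NO.
predicted = ["YES"] * 100 + ["NO"] * 10 + ["YES"] * 5 + ["NO"] * 35

# Each distinct (actual, predicted) pair is one cell of the matrix.
cells = Counter(zip(actual, predicted))

# Row totals (actual classes) and column totals (predicted classes).
actual_yes = cells[("YES", "YES")] + cells[("YES", "NO")]
actual_no = cells[("NO", "YES")] + cells[("NO", "NO")]
predicted_yes = cells[("YES", "YES")] + cells[("NO", "YES")]
predicted_no = cells[("YES", "NO")] + cells[("NO", "NO")]

print("actual YES / NO:   ", actual_yes, actual_no)
print("predicted YES / NO:", predicted_yes, predicted_no)
```

The row totals describe what actually happened, while the column totals describe what the classifier said.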

Terminology –

True Positive (TP) -> Observations that were predicted YES and were actually YES.
True Negative (TN) -> Observations that were predicted NO and were actually NO.
False Positive (FP) -> Observations that were predicted YES but were actually NO.
False Negative (FN) -> Observations that were predicted NO but were actually YES.
Accuracy -> How often the classifier is correct overall, i.e. correct predictions divided by all observations.
Error Rate -> How often the classifier is wrong overall, i.e. incorrect predictions divided by all observations.
True Positive Rate or Recall (TPR) -> How often the classifier predicts YES when the observation is actually YES.
False Positive Rate (FPR) -> How often the classifier predicts YES when the observation is actually NO.
True Negative Rate (TNR) -> How often the classifier predicts NO when the observation is actually NO.
Precision -> TP/(TP+FP) i.e. true positives divided by predicted YES.
Recall -> TP/(TP+FN) i.e. true positives divided by actual YES (the same quantity as TPR).
F beta score -> ((1 + beta^2) * Precision * Recall) / (beta^2 * Precision + Recall) (0.5, 1 and 2 are common values of beta).
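The precision, recall and F beta formulas above can be sketched as small Python functions. This is an illustrative sketch rather than a library API: the function names are made up for this example, and there is no guard against division by zero when a denominator is empty. The counts passed in are the ones used in this recipe.

```python
def precision(tp, fp):
    # True positives divided by everything predicted YES.
    return tp / (tp + fp)


def recall(tp, fn):
    # True positives divided by everything that is actually YES.
    return tp / (tp + fn)


def f_beta(tp, fp, fn, beta=1.0):
    # Weighted harmonic mean of precision and recall.
    # beta > 1 weights recall more heavily; beta < 1 weights precision more.
    p = precision(tp, fp)
    r = recall(tp, fn)
    b2 = beta ** 2
    return (1 + b2) * p * r / (b2 * p + r)


# Counts from this recipe: TP = 100, FP = 5, FN = 10.
for beta in (0.5, 1, 2):
    print(f"F{beta} score: {f_beta(100, 5, 10, beta):.3f}")
```

With beta = 1 the score reduces to the familiar F1 score, 2 * Precision * Recall / (Precision + Recall).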


Let us calculate the above-mentioned measures for our confusion matrix.
From the table,

TP = 100
FP = 5
TN = 35
FN = 10
Accuracy = (TP + TN) / (TP+TN+FP+FN) = (100+35)/150 = 0.9
Error Rate = (FP + FN) / (TP+TN+FP+FN) = (5+10)/150 = 0.1
TPR = TP/actual YES = 100/110 = 0.909
FPR = FP/actual NO = 5/40 = 0.125
TNR = TN/actual NO = 35/40 = 0.875
Precision = TP/predicted YES = 100/105 = 0.952
This is how you can derive conclusions from a confusion matrix.
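The arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the four counts are taken from this recipe, while the dictionary layout and key names are just one illustrative way to organize the results.

```python
# Cell counts from the confusion matrix in this recipe.
tp, fp, tn, fn = 100, 5, 35, 10

total = tp + fp + tn + fn   # all observations
actual_yes = tp + fn        # row total for actual YES
actual_no = tn + fp         # row total for actual NO
predicted_yes = tp + fp     # column total for predicted YES

metrics = {
    "accuracy": (tp + tn) / total,
    "error_rate": (fp + fn) / total,
    "tpr": tp / actual_yes,
    "fpr": fp / actual_no,
    "tnr": tn / actual_no,
    "precision": tp / predicted_yes,
}

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that accuracy and error rate always sum to 1, as do FPR and TNR, because each pair splits the same denominator.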

To learn how to build a confusion matrix you can refer to the following tutorials –

How to build a confusion matrix in R
How to get Classification Confusion Matrix?
Classification report and Confusion matrix in python
