How to get Classification Accuracy in R

This recipe helps you get Classification Accuracy in R

Recipe Objective

How to get Classification Accuracy?

Logistic regression is a supervised learning model used for classification. It applies when the independent variables (x) are continuous or categorical and the dependent variable (y) is categorical.

Accuracy is a metric for evaluating the performance of a model when it is applied to a test data set. It is computed from the counts in a confusion matrix.

**Accuracy**: Accuracy measures the fraction of all predictions that the model got right, so we want it to be as high as possible.

**Accuracy = (True Positives + True Negatives) / (True Positives + True Negatives + False Positives + False Negatives)**

This recipe demonstrates how to compute classification accuracy on an example.
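As a quick illustration of the accuracy formula, the following R sketch computes accuracy from the four counts. The helper name `accuracy_from_counts` and the example counts are made up for illustration and are not taken from this recipe's data.

```r
# Hypothetical helper: accuracy from the four confusion-matrix counts
accuracy_from_counts <- function(tp, tn, fp, fn) {
  (tp + tn) / (tp + tn + fp + fn)
}

# Illustrative counts (assumed, not taken from the recipe's data)
acc <- accuracy_from_counts(tp = 40, tn = 45, fp = 5, fn = 10)
print(acc)  # 85 correct out of 100 predictions -> 0.85
```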

Step 1 - Install necessary libraries

install.packages('caret')
install.packages('e1071')
library(caret)
library(e1071)

Step 2 - Generate random data

Consider a binary classification example on randomly generated data, where the task is to classify whether a patient has cancer or not.

Class 1: the patient has cancer
Class 0: the patient does not have cancer

The goal is to classify the data correctly and keep the number of wrongly classified patients as low as possible.

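set.seed(1)
# Actual data points: runif(100,1,3) draws values in [1,3), which R truncates to index 1 or 2
actual_data = c("actual = 0","actual = 1")[runif(100,1,3)]
# Predicted data points: start from the actual labels, then overwrite up to 20 random positions
# with labels taken from other random positions
predicted_data = actual_data
predicted_data[runif(20,1,100)] = actual_data[runif(20,1,100)]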
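The data generation above relies on R truncating fractional index values toward zero. The sketch below shows that behavior on its own, followed by a more explicit alternative based on `sample()`. The alternative is not the recipe's method: the names `labels`, `actual_alt`, `predicted_alt` and `flip_idx` are illustrative, and it draws different random values, so its counts will not match the ones reported in this recipe.

```r
# Fractional indices are truncated toward zero, so 1.2 -> 1 and 2.9 -> 2
labels <- c("actual = 0", "actual = 1")
print(labels[c(1.2, 2.9)])

# Alternative sketch (assumption, not the recipe's code): draw labels with sample()
set.seed(1)
actual_alt <- sample(labels, size = 100, replace = TRUE)

# Copy the actual labels, then flip 10 distinct positions to the other class
predicted_alt <- actual_alt
flip_idx <- sample(seq_along(actual_alt), size = 10)
predicted_alt[flip_idx] <- ifelse(actual_alt[flip_idx] == "actual = 0",
                                  "actual = 1", "actual = 0")

# Exactly 10 of the 100 predictions are now wrong
print(sum(predicted_alt != actual_alt))
```

Flipping a fixed number of distinct positions makes the error rate known in advance, which can be convenient when checking an accuracy calculation.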

Step 3 - Create a Confusion Matrix

**Confusion matrix**: A confusion matrix summarizes the performance of a classification algorithm. It counts the correct and incorrect predictions and lists them by each combination of predicted and actual class. It gives you insight not only into the errors made by your classifier but, more importantly, into the types of errors that have been made.

TN: The patient does not have cancer, and we predict that the patient does not have cancer.
TP: The patient has cancer, and we predict that the patient has cancer.
FP: We predict that the patient has cancer, but the patient does not have cancer.
FN: We predict that the patient does not have cancer, but the patient actually has cancer.
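The four definitions above can be sketched as simple counts over a pair of label vectors. The vectors `actual` and `predicted` here are small made-up examples (1 = has cancer, 0 = does not have cancer), not the recipe's data.

```r
# Small made-up example: 1 = has cancer, 0 = does not have cancer
actual    <- c(1, 1, 1, 0, 0, 0, 0, 1)
predicted <- c(1, 0, 1, 0, 1, 0, 0, 1)

tp <- sum(actual == 1 & predicted == 1)  # has cancer, predicted cancer
tn <- sum(actual == 0 & predicted == 0)  # no cancer, predicted no cancer
fp <- sum(actual == 0 & predicted == 1)  # no cancer, predicted cancer
fn <- sum(actual == 1 & predicted == 0)  # has cancer, predicted no cancer

print(c(TP = tp, TN = tn, FP = fp, FN = fn))
```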

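confusion_mat = as.matrix(table(Actual_Values = actual_data, Predicted_Values = predicted_data))
print(confusion_mat)
# Correct predictions sit on the diagonal of the matrix.
# With the counts reported in this recipe this equals (49+43)/(49+43+3+5)
accuracy = sum(diag(confusion_mat))/sum(confusion_mat)
accuracy

Output of code: 0.92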

Step 4 - Confusion matrix using the 'caret' package

confusionMatrix(factor(predicted_data),factor(actual_data))
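`confusionMatrix()` takes the predictions as its first argument (`data`) and the true labels as its second (`reference`), and by default treats the first factor level as the positive class. The sketch below, on small made-up vectors rather than the recipe's data, sets the positive class explicitly and pulls individual values out of the returned object. It assumes the caret package is installed.

```r
library(caret)

# Small made-up example: 1 = has cancer, 0 = does not have cancer
actual    <- factor(c(1, 1, 1, 0, 0, 0, 0, 1), levels = c(0, 1))
predicted <- factor(c(1, 0, 1, 0, 1, 0, 0, 1), levels = c(0, 1))

# data = predictions, reference = true labels, positive = class of interest
cm <- confusionMatrix(data = predicted, reference = actual, positive = "1")

print(cm$table)                      # rows: Prediction, columns: Reference
print(cm$overall[["Accuracy"]])      # overall accuracy
print(cm$byClass[["Sensitivity"]])   # recall for the positive class

# Cross-check: accuracy is the share of predictions that match the true labels
manual_acc <- mean(predicted == actual)
print(manual_acc)
```

Note that caret prints predictions in rows and reference values in columns, which is the transpose of the `table(Actual_Values, Predicted_Values)` layout used in Step 3.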

Conclusion: Here, the accuracy is 0.92, or 92% (92 out of 100 data points were predicted correctly). From the confusion matrix, we can see that there are 5 false negatives (FN), i.e. patients who have cancer but are predicted as not having cancer. This number should be reduced, since these patients would not receive treatment because of the missed diagnosis. Accuracy alone does not provide sufficient information about the performance of a model, so other performance metrics such as precision and recall should also be considered.
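To make the point about precision and recall concrete, here is a minimal sketch using the counts behind the 0.92 accuracy. It rests on an assumption: the recipe states FN = 5, while TP = 43, TN = 49 and FP = 3 are inferred from the accuracy calculation in Step 3 and the alphabetical ordering of the class labels.

```r
# Counts assumed from the recipe: FN = 5 is stated; TP, TN and FP are inferred
tp <- 43; tn <- 49; fp <- 3; fn <- 5

accuracy  <- (tp + tn) / (tp + tn + fp + fn)
precision <- tp / (tp + fp)   # of the predicted cancer cases, how many are real
recall    <- tp / (tp + fn)   # of the real cancer cases, how many were caught

print(round(c(accuracy = accuracy, precision = precision, recall = recall), 3))
```

Under these assumed counts, recall comes out lower than accuracy, which reflects the 5 missed cancer cases that the single accuracy figure hides.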
