How does Linear Discriminant Analysis work?

How does Linear Discriminant Analysis work?

How does Linear Discriminant Analysis work?

This recipe explains how Linear Discriminant Analysis work

Recipe Objective

Linear Discriminant Analysis is a classifier with a linear decision boundary, generated by fitting class conditional densities to the data and using Bayes' rule. It fits a Gaussian density to each class, assuming that all classes share the same covariance matrix.

So this recipe is a short example on how does Linear Discriminant Analysis work. Let's get started.

Step 1 - Import the library

from sklearn import datasets from sklearn.model_selection import train_test_split from sklearn.datasets import load_iris from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

Let's pause and look at these imports. We have exported train_test_split which helps in randomly breaking the datset in two parts. Here sklearn.dataset is used to import one classification based model dataset. Also, we have exported LinearDiscriminantAnalysis to build our model.

Step 2 - Setup the Data

X,y=load_iris(return_X_y=True) X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

Here, we have used load_iris function to import our dataset in two list form (X and y) and therefore kept return_X_y to be True. Further with have broken down the dataset into 2 parts, train and test with ratio 3:4.

Now our dataset is ready.

Step 3 - Building the model

model = LinearDiscriminantAnalysis()

We have simply built a classification model with LinearDiscriminantAnalysis with default values.

Step 4 - Fit the model and predict for test set, y_train) y_pred= model.predict(X_test)

Here we have simply fit used fit function to fit our model on X_train and y_train. Now, we are predicting the values of X_test using our built model.

Step 5 - Printing the accuracy

print(model.score(X_train,y_train)) print(model.score(X_test,y_test))

Here we have calculated accuracy score using score function for both our train and test set.

Step 6 - Lets look at our dataset now

Once we run the above code snippet, we will see:


Clearly, the model built for the given datset is efficient on any unknown set.

Relevant Projects

Build OCR from Scratch Python using YOLO and Tesseract
In this deep learning project, you will learn how to build your custom OCR (optical character recognition) from scratch by using Google Tesseract and YOLO to read the text from any images.

Build a Music Recommendation Algorithm using KKBox's Dataset
Music Recommendation Project using Machine Learning - Use the KKBox dataset to predict the chances of a user listening to a song again after their very first noticeable listening event.

Data Science Project - Instacart Market Basket Analysis
Data Science Project - Build a recommendation engine which will predict the products to be purchased by an Instacart consumer again.

Image Segmentation using Mask R-CNN with Tensorflow
In this Deep Learning Project on Image Segmentation Python, you will learn how to implement the Mask R-CNN model for early fire detection.

Natural language processing Chatbot application using NLTK for text classification
In this NLP AI application, we build the core conversational engine for a chatbot. We use the popular NLTK text classification library to achieve this.

Build an Image Classifier for Plant Species Identification
In this machine learning project, we will use binary leaf images and extracted features, including shape, margin, and texture to accurately identify plant species using different benchmark classification techniques.

Data Science Project in Python on BigMart Sales Prediction
The goal of this data science project is to build a predictive model and find out the sales of each product at a given Big Mart store.

Build a Similar Images Finder with Python, Keras, and Tensorflow
Build your own image similarity application using Python to search and find images of products that are similar to any given product. You will implement the K-Nearest Neighbor algorithm to find products with maximum similarity.

Human Activity Recognition Using Multiclass Classification in Python
In this human activity recognition project, we use multiclass classification machine learning techniques to analyse fitness dataset from a smartphone tracker.

Demand prediction of driver availability using multistep time series analysis
In this supervised learning machine learning project, you will predict the availability of a driver in a specific area by using multi step time series analysis.