How does Linear Discriminant Analysis work?

How does Linear Discriminant Analysis work?

How does Linear Discriminant Analysis work?

This recipe explains how Linear Discriminant Analysis work


Recipe Objective

Linear Discriminant Analysis is a classifier with a linear decision boundary, generated by fitting class conditional densities to the data and using Bayes' rule. It fits a Gaussian density to each class, assuming that all classes share the same covariance matrix.

So this recipe is a short example on how does Linear Discriminant Analysis work. Let's get started.

Step 1 - Import the library

from sklearn import datasets from sklearn.model_selection import train_test_split from sklearn.datasets import load_iris from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

Let's pause and look at these imports. We have exported train_test_split which helps in randomly breaking the datset in two parts. Here sklearn.dataset is used to import one classification based model dataset. Also, we have exported LinearDiscriminantAnalysis to build our model.

Step 2 - Setup the Data

X,y=load_iris(return_X_y=True) X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

Here, we have used load_iris function to import our dataset in two list form (X and y) and therefore kept return_X_y to be True. Further with have broken down the dataset into 2 parts, train and test with ratio 3:4.

Now our dataset is ready.

Step 3 - Building the model

model = LinearDiscriminantAnalysis()

We have simply built a classification model with LinearDiscriminantAnalysis with default values.

Step 4 - Fit the model and predict for test set, y_train) y_pred= model.predict(X_test)

Here we have simply fit used fit function to fit our model on X_train and y_train. Now, we are predicting the values of X_test using our built model.

Step 5 - Printing the accuracy

print(model.score(X_train,y_train)) print(model.score(X_test,y_test))

Here we have calculated accuracy score using score function for both our train and test set.

Step 6 - Lets look at our dataset now

Once we run the above code snippet, we will see:


Clearly, the model built for the given datset is efficient on any unknown set.

Relevant Projects

Data Science Project-TalkingData AdTracking Fraud Detection
Machine Learning Project in R-Detect fraudulent click traffic for mobile app ads using R data science programming language.

Time Series Forecasting with LSTM Neural Network Python
Deep Learning Project- Learn to apply deep learning paradigm to forecast univariate time series data.

Identifying Product Bundles from Sales Data Using R Language
In this data science project in R, we are going to talk about subjective segmentation which is a clustering technique to find out product bundles in sales data.

Natural language processing Chatbot application using NLTK for text classification
In this NLP AI application, we build the core conversational engine for a chatbot. We use the popular NLTK text classification library to achieve this.

Predict Credit Default | Give Me Some Credit Kaggle
In this data science project, you will predict borrowers chance of defaulting on credit loans by building a credit score prediction model.

Data Science Project in Python on BigMart Sales Prediction
The goal of this data science project is to build a predictive model and find out the sales of each product at a given Big Mart store.

Music Recommendation System Project using Python and R
Machine Learning Project - Work with KKBOX's Music Recommendation System dataset to build the best music recommendation engine.

Data Science Project - Instacart Market Basket Analysis
Data Science Project - Build a recommendation engine which will predict the products to be purchased by an Instacart consumer again.

Forecast Inventory demand using historical sales data in R
In this machine learning project, you will develop a machine learning model to accurately forecast inventory demand based on historical sales data.

German Credit Dataset Analysis to Classify Loan Applications
In this data science project, you will work with German credit dataset using classification techniques like Decision Tree, Neural Networks etc to classify loan applications using R.