How to use NaiveBayes Classifier?
MACHINE LEARNING RECIPES DATA CLEANING PYTHON DATA MUNGING PANDAS CHEATSHEET     ALL TAGS

How to use NaiveBayes Classifier?

How to use NaiveBayes Classifier?

This recipe helps you use NaiveBayes Classifier

0

Recipe Objective

Naive Bayes classifiers are a collection of classification algorithms based on Bayes' Theorem. It is not a single algorithm but a family of algorithms where all of them share a common principle, i.e. every pair of features being classified is independent of each other.

So this recipe is a short example on how to use NaiveBayes Classifier. Let's get started.

Step 1 - Import the library

from sklearn import datasets from sklearn import metrics from sklearn.model_selection import train_test_split from sklearn.datasets import load_iris from sklearn.naive_bayes import GaussianNB

Let's pause and look at these imports. We have exported train_test_split which helps in randomly breaking the datset in two parts. Here sklearn.dataset is used to import one classification based model dataset. Also, we have exported Guassian Naive Bays library to build our model.

Step 2 - Setup the Data

X,y=load_iris(return_X_y=True) X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

Here, we have used load_iris function to import our dataset in two list form (X and y) and therefore kept return_X_y to be True. Further with have broken down the dataset into 2 parts, train and test with ratio 3:4.

Now our dataset is ready.

Step 3 - Building the model

model = GaussianNB()

We have simply built a classification model with GaussianNB with default values.

Step 4 - Fit the model and predict for test set

model.fit(X_train, y_train) y_pred= model.predict(X_test)

Here we have simply fit used fit function to fit our model on X_train and y_train. Now, we are predicting the values of X_test using our built model.

Step 5 - Printing the accuracy

print(metrics.accuracy_score(y_test, y_pred)*100)

Here we have calculated accuracy score using matrics library

Step 6 - Lets look at our dataset now

Once we run the above code snippet, we will see:

97.36842105263158

Clearly, the model built for the given datset in highly efficient.

Relevant Projects

Ecommerce product reviews - Pairwise ranking and sentiment analysis
This project analyzes a dataset containing ecommerce product reviews. The goal is to use machine learning models to perform sentiment analysis on product reviews and rank them based on relevance. Reviews play a key role in product recommendation systems.

Customer Churn Prediction Analysis using Ensemble Techniques
In this machine learning churn project, we implement a churn prediction model in python using ensemble techniques.

German Credit Dataset Analysis to Classify Loan Applications
In this data science project, you will work with German credit dataset using classification techniques like Decision Tree, Neural Networks etc to classify loan applications using R.

Time Series Forecasting with LSTM Neural Network Python
Deep Learning Project- Learn to apply deep learning paradigm to forecast univariate time series data.

Walmart Sales Forecasting Data Science Project
Data Science Project in R-Predict the sales for each department using historical markdown data from the Walmart dataset containing data of 45 Walmart stores.

Data Science Project - Instacart Market Basket Analysis
Data Science Project - Build a recommendation engine which will predict the products to be purchased by an Instacart consumer again.

Data Science Project on Wine Quality Prediction in R
In this R data science project, we will explore wine dataset to assess red wine quality. The objective of this data science project is to explore which chemical properties will influence the quality of red wines.

Solving Multiple Classification use cases Using H2O
In this project, we are going to talk about H2O and functionality in terms of building Machine Learning models.

Predict Churn for a Telecom company using Logistic Regression
Machine Learning Project in R- Predict the customer churn of telecom sector and find out the key drivers that lead to churn. Learn how the logistic regression model using R can be used to identify the customer churn in telecom dataset.

Deep Learning with Keras in R to Predict Customer Churn
In this deep learning project, we will predict customer churn using Artificial Neural Networks and learn how to model an ANN in R with the keras deep learning package.