How to compare sklearn classification algorithms in Python?

This recipe helps you compare sklearn classification algorithms in Python

Recipe Objective

How do you decide which machine learning model to use on a dataset? Trying models one at a time and testing each separately is tedious. Instead, we will apply several models at once and compare their cross-validated accuracy.

So this is the recipe on how we can compare sklearn classification algorithms in Python.

Step 1 - Import the library

import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn import model_selection
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

plt.style.use('ggplot')

We have imported every classifier we want to compare, along with the modules needed to load the data, split it, cross-validate the models and plot the results.

Step 2 - Loading the Dataset

We use the built-in wine dataset, storing the features in X and the target in y. train_test_split holds out 30% of the rows as a test set. We also create a variable seed and pass it to KFold through the random_state parameter. KFold needs shuffle=True for random_state to take effect; recent scikit-learn versions raise an error if random_state is set without it.

seed = 50
dataset = datasets.load_wine()
X = dataset.data
y = dataset.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)
kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
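To see what the split and the KFold object produce, here is a minimal sketch. It assumes a random_state on train_test_split, which the recipe itself does not set, so that the row counts are repeatable. The name fold_sizes is only illustrative.

```python
from sklearn import datasets
from sklearn.model_selection import KFold, train_test_split

seed = 50
dataset = datasets.load_wine()
X, y = dataset.data, dataset.target

# Assumption: random_state is added here for repeatability; the recipe omits it.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=seed
)

# shuffle=True is required whenever random_state is passed to KFold.
kfold = KFold(n_splits=10, shuffle=True, random_state=seed)

# Each split yields (train_indices, validation_indices) into X_train.
fold_sizes = [len(val_idx) for _, val_idx in kfold.split(X_train)]

print("training rows:", X_train.shape[0], "test rows:", X_test.shape[0])
print("validation rows per fold:", fold_sizes)
```

Every training row lands in exactly one validation fold, so the fold sizes add up to the number of training rows.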

Step 3 - Loading all Models

Here we create an empty list and append each model to it as a (name, estimator) tuple: LogisticRegression, LinearDiscriminantAnalysis, KNeighborsClassifier, DecisionTreeClassifier, GaussianNB and SVC.

models = []
models.append(('LR', LogisticRegression()))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier()))
models.append(('NB', GaussianNB()))
models.append(('SVM', SVC()))
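KNeighborsClassifier and SVC both rely on distances between samples, so their scores depend on how the features are scaled. As a possible variation that is not part of the original recipe, a model can be wrapped in a pipeline with StandardScaler before it is evaluated. The sketch below compares raw and scaled versions of those two models on the full wine data; the names candidates and comparison are illustrative.

```python
from sklearn import datasets
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

seed = 50
X, y = datasets.load_wine(return_X_y=True)
kfold = KFold(n_splits=10, shuffle=True, random_state=seed)

# Only the two distance-based models from the recipe are checked here.
candidates = [("KNN", KNeighborsClassifier()), ("SVM", SVC())]

comparison = {}
for name, model in candidates:
    # Mean accuracy on the features as loaded.
    raw = cross_val_score(model, X, y, cv=kfold, scoring="accuracy").mean()
    # Same model, but the scaler is refit on each training fold first.
    scaled_model = make_pipeline(StandardScaler(), model)
    scaled = cross_val_score(scaled_model, X, y, cv=kfold, scoring="accuracy").mean()
    comparison[name] = (raw, scaled)
    print(f"{name}: raw={raw:.3f} scaled={scaled:.3f}")
```

Putting the scaler inside the pipeline keeps the validation fold out of the scaling statistics, which would not be the case if X were scaled once up front.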

Step 4 - Evaluating the models

Here we create two empty lists, results and names, and a variable scoring that holds the metric name. A for loop iterates over all the models. Inside the loop, we build a KFold splitter and pass it to cross_val_score, which returns one accuracy score per fold. Finally, we print the mean and standard deviation of those scores for each model.

results = []
names = []
scoring = 'accuracy'
for name, model in models:
    # shuffle=True is required when random_state is set
    kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
    cv_results = model_selection.cross_val_score(model, X_train, y_train, cv=kfold, scoring=scoring)
    results.append(cv_results)
    names.append(name)
    msg = "%s: %f (%f)" % (name, cv_results.mean(), cv_results.std())
    print(msg)
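Steps 1 to 4 can be put together into one script, sketched below. Two things here are assumptions rather than part of the recipe: random_state on train_test_split, added so repeated runs use the same split, and max_iter=10000 on LogisticRegression, added to reduce convergence warnings on the unscaled features. The summary dictionary and the best variable are illustrative names.

```python
from sklearn import datasets
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

seed = 50
X, y = datasets.load_wine(return_X_y=True)

# Assumption: random_state added so the split is the same on every run.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=seed
)

models = [
    ("LR", LogisticRegression(max_iter=10000)),  # max_iter is an assumption
    ("LDA", LinearDiscriminantAnalysis()),
    ("KNN", KNeighborsClassifier()),
    ("CART", DecisionTreeClassifier(random_state=seed)),
    ("NB", GaussianNB()),
    ("SVM", SVC()),
]

summary = {}
for name, model in models:
    kfold = KFold(n_splits=10, shuffle=True, random_state=seed)
    # One accuracy value per fold, computed on the training portion only.
    scores = cross_val_score(model, X_train, y_train, cv=kfold, scoring="accuracy")
    summary[name] = (scores.mean(), scores.std())
    print(f"{name}: {scores.mean():.6f} ({scores.std():.6f})")

# Pick the model with the highest mean cross-validated accuracy.
best = max(summary, key=lambda n: summary[n][0])
print("highest mean accuracy:", best)
```

The held-out X_test and y_test are left untouched by this loop, so they remain available for a final check of whichever model is chosen.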

Step 5 - Plotting the Box Plot

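We also draw a box plot to visualize how the per-fold scores of each model are spread.

fig = plt.figure(figsize=(10,10))
fig.suptitle('How to compare sklearn classification algorithms')
ax = fig.add_subplot(111)
plt.boxplot(results)
ax.set_xticklabels(names)
plt.show()

The printed output looks like this. Exact figures vary between runs because train_test_split is called without a random_state.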

LR: 0.960256 (0.039806)
LDA: 0.984615 (0.030769)
KNN: 0.711538 (0.123736)
CART: 0.889103 (0.086955)
NB: 0.951282 (0.064499)
SVM: 0.434615 (0.105752)
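The box plot step can also be run without a display by writing the figure to a file. The following is a minimal sketch under a few assumptions: it uses the non-interactive Agg backend, a hypothetical output file name model_comparison.png, a seeded train/test split, and only three of the recipe's models to keep it short.

```python
import os

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

seed = 50
X, y = datasets.load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=seed  # random_state is an assumption
)

# A subset of the recipe's models, chosen only for brevity.
models = [
    ("LDA", LinearDiscriminantAnalysis()),
    ("CART", DecisionTreeClassifier(random_state=seed)),
    ("NB", GaussianNB()),
]

results, names = [], []
for name, model in models:
    kfold = KFold(n_splits=10, shuffle=True, random_state=seed)
    scores = cross_val_score(model, X_train, y_train, cv=kfold, scoring="accuracy")
    results.append(scores)
    names.append(name)

# One box per model, each summarizing that model's ten fold scores.
fig, ax = plt.subplots(figsize=(10, 10))
fig.suptitle("How to compare sklearn classification algorithms")
ax.boxplot(results)
ax.set_xticklabels(names)
ax.set_ylabel("accuracy")

out_path = "model_comparison.png"  # hypothetical file name
fig.savefig(out_path)
plt.close(fig)
print("saved:", os.path.abspath(out_path))
```

A tall box or long whiskers means a model's accuracy changes a lot from fold to fold, which the mean alone does not show.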
