How to classify wine using sklearn ensemble Bagging model in ML in python

This recipe helps you classify wine using sklearn ensemble models (Bagging, Extra Trees and Random Forest) in Python
Last Updated: 26 Jul 2022

Recipe Objective

Have you ever tried using ensemble models such as Bagging Classifier, Extra Trees Classifier and Random Forest Classifier for analysis? In this recipe we will use all three on the same dataset.

So this recipe is a short example of how we can classify "wine" using sklearn ensemble models, starting with Bagging. It is a multiclass classification problem.

Step 1 - Import the library

from sklearn import datasets
from sklearn import metrics
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
plt.style.use("ggplot")
from sklearn import ensemble

Here we have imported datasets, metrics and ensemble from sklearn, train_test_split from sklearn.model_selection, and matplotlib.pyplot for plotting. We will see what each of them does when it appears in the code snippets below.
For now, just have a look at these imports.

Step 2 - Setup the Data

Here we have used datasets to load the inbuilt wine dataset, and we have created objects X and y to store the data and the target values respectively. We then split them into training and test sets, holding out 25% of the samples for testing.

dataset = datasets.load_wine()
X = dataset.data; y = dataset.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
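The split above is random, so the reports later in this recipe will differ slightly from run to run. Below is a minimal sketch of a reproducible split, assuming you want fixed shuffling and similar class proportions in both parts. The values random_state=42 and stratify=y are choices made for this sketch and are not part of the original recipe.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

dataset = datasets.load_wine()
X = dataset.data
y = dataset.target

# Inspect what was loaded: samples x features, and the class labels.
print(X.shape)
print(list(dataset.target_names))

# random_state fixes the shuffle; stratify keeps class proportions similar
# in the train and test parts. Both are assumptions made for this sketch.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)
print(X_train.shape, X_test.shape)
```

With test_size=0.25 the wine dataset yields 45 test samples, which matches the support totals in the reports shown later.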

Step 3 - Model and its Score

Here, we are using Bagging Classifier as a Machine Learning model to fit the data.

model = ensemble.BaggingClassifier()
model.fit(X_train, y_train)
print(model)

Now we predict the output by passing X_test, and we store the real target values in expected_y.

expected_y = y_test
predicted_y = model.predict(X_test)

Here we print the classification report and confusion matrix for the classifier.

print(metrics.classification_report(expected_y, predicted_y, target_names=dataset.target_names))
print(metrics.confusion_matrix(expected_y, predicted_y))
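With its defaults, BaggingClassifier fits 10 decision trees, each on a bootstrap sample of the training rows, and combines their predictions. The sketch below writes the main settings out explicitly and also asks for an out-of-bag estimate. The values n_estimators=50 and random_state=0 are illustrative assumptions, not settings from the recipe.

```python
from sklearn import datasets, ensemble, metrics
from sklearn.model_selection import train_test_split

dataset = datasets.load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.25, random_state=0
)

# Illustrative settings (assumptions for this sketch):
# 50 trees, each trained on a bootstrap sample as large as the training set.
bagging = ensemble.BaggingClassifier(
    n_estimators=50,
    max_samples=1.0,
    bootstrap=True,
    oob_score=True,   # score each sample using trees that did not see it
    random_state=0,
)
bagging.fit(X_train, y_train)

test_accuracy = metrics.accuracy_score(y_test, bagging.predict(X_test))
print("Out-of-bag accuracy:", bagging.oob_score_)
print("Test accuracy:", test_accuracy)
```

The out-of-bag score gives a rough accuracy estimate from the training data alone, which can be handy on a dataset this small.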

Step 4 - Model and its Score

Here, we are using Extra Trees Classifier as a Machine Learning model to fit the data.

model = ensemble.ExtraTreesClassifier()
model.fit(X_train, y_train)
print(model)

Now we predict the output by passing X_test, and we store the real target values in expected_y.

expected_y = y_test
predicted_y = model.predict(X_test)

Here we print the classification report and confusion matrix for the classifier.

print(metrics.classification_report(expected_y, predicted_y, target_names=dataset.target_names))
print(metrics.confusion_matrix(expected_y, predicted_y))
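Tree ensembles in scikit-learn expose a feature_importances_ attribute after fitting, which can hint at which measurements drive the classification. A minimal sketch follows, assuming n_estimators=100 and a fixed random_state (illustrative values). For simplicity it fits on the whole dataset rather than on the training split used in the recipe.

```python
import numpy as np
from sklearn import datasets, ensemble

dataset = datasets.load_wine()

# Illustrative settings; fitted on all samples only to rank the features.
model = ensemble.ExtraTreesClassifier(n_estimators=100, random_state=0)
model.fit(dataset.data, dataset.target)

importances = model.feature_importances_

# Indices of features sorted from most to least important.
order = np.argsort(importances)[::-1]
for idx in order[:5]:
    print(f"{dataset.feature_names[idx]:<30} {importances[idx]:.3f}")
```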

Step 5 - Model and its Score

Here, we are using Random Forest Classifier as a Machine Learning model to fit the data.

model = ensemble.RandomForestClassifier()
model.fit(X_train, y_train)
print(model)

Now we predict the output by passing X_test, and we store the real target values in expected_y.

expected_y = y_test
predicted_y = model.predict(X_test)

Here we print the classification report and confusion matrix for the classifier.

print(metrics.classification_report(expected_y, predicted_y, target_names=dataset.target_names))
print(metrics.confusion_matrix(expected_y, predicted_y))

As an output we get:

BaggingClassifier(base_estimator=None, bootstrap=True,
         bootstrap_features=False, max_features=1.0, max_samples=1.0,
         n_estimators=10, n_jobs=None, oob_score=False, random_state=None,
         verbose=0, warm_start=False)

ensemble.BaggingClassifier(): 

              precision    recall  f1-score   support

     class_0       1.00      0.93      0.96        14
     class_1       0.95      0.95      0.95        21
     class_2       0.91      1.00      0.95        10

   micro avg       0.96      0.96      0.96        45
   macro avg       0.95      0.96      0.96        45
weighted avg       0.96      0.96      0.96        45


[[13  1  0]
 [ 0 20  1]
 [ 0  0 10]]
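The precision and recall figures in the Bagging Classifier report above follow directly from its confusion matrix, where rows are the true classes and columns are the predicted classes. As a worked check, this sketch recomputes them with NumPy from the matrix printed above:

```python
import numpy as np

# Confusion matrix printed above for BaggingClassifier
# (rows: true class, columns: predicted class).
cm = np.array([[13, 1, 0],
               [0, 20, 1],
               [0, 0, 10]])

true_positives = np.diag(cm)
precision = true_positives / cm.sum(axis=0)  # column sums = predicted counts
recall = true_positives / cm.sum(axis=1)     # row sums = true counts (support)
accuracy = true_positives.sum() / cm.sum()

print(np.round(precision, 2))  # per-class precision
print(np.round(recall, 2))     # per-class recall
print(round(accuracy, 2))      # overall accuracy
```

For example, class_0 has 13 correct predictions out of 14 true samples, giving a recall of 0.93, and all 13 samples predicted as class_0 are correct, giving a precision of 1.00.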

ExtraTreesClassifier(bootstrap=False, class_weight=None, criterion="gini",
           max_depth=None, max_features="auto", max_leaf_nodes=None,
           min_impurity_decrease=0.0, min_impurity_split=None,
           min_samples_leaf=1, min_samples_split=2,
           min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=None,
           oob_score=False, random_state=None, verbose=0, warm_start=False)

ensemble.ExtraTreesClassifier(): 

              precision    recall  f1-score   support

     class_0       0.93      1.00      0.97        14
     class_1       1.00      0.90      0.95        21
     class_2       0.91      1.00      0.95        10

   micro avg       0.96      0.96      0.96        45
   macro avg       0.95      0.97      0.96        45
weighted avg       0.96      0.96      0.96        45


[[14  0  0]
 [ 1 19  1]
 [ 0  0 10]]

RandomForestClassifier(bootstrap=True, class_weight=None, criterion="gini",
            max_depth=None, max_features="auto", max_leaf_nodes=None,
            min_impurity_decrease=0.0, min_impurity_split=None,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=None,
            oob_score=False, random_state=None, verbose=0,
            warm_start=False)

ensemble.RandomForestClassifier(): 

              precision    recall  f1-score   support

     class_0       1.00      0.93      0.96        14
     class_1       0.95      1.00      0.98        21
     class_2       1.00      1.00      1.00        10

   micro avg       0.98      0.98      0.98        45
   macro avg       0.98      0.98      0.98        45
weighted avg       0.98      0.98      0.98        45


[[13  1  0]
 [ 0 21  0]
 [ 0  0 10]]
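The three reports above come from a single random split with 45 test samples, so a difference of one or two misclassified samples between models may be run-to-run noise. Below is a sketch of a steadier comparison, assuming 5-fold stratified cross-validation and fixed seeds. These are choices made for this example and are not part of the recipe.

```python
from sklearn import datasets, ensemble
from sklearn.model_selection import StratifiedKFold, cross_val_score

dataset = datasets.load_wine()
X, y = dataset.data, dataset.target

# Same three ensemble models as in the recipe, with fixed seeds (assumption).
models = {
    "Bagging": ensemble.BaggingClassifier(random_state=0),
    "ExtraTrees": ensemble.ExtraTreesClassifier(random_state=0),
    "RandomForest": ensemble.RandomForestClassifier(random_state=0),
}

# Every model is scored on the same five stratified folds.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    results[name] = scores
    print(f"{name:<13} mean={scores.mean():.3f} std={scores.std():.3f}")
```

Because each model is evaluated on the same folds, the mean accuracies are more comparable than three reports from one split.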
