How to implement voting ensemble in Python?

This recipe helps you implement voting ensemble in Python

Recipe Objective

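How do you decide which model to use for a dataset? Instead of picking a single model, we can use a voting ensemble, which trains several different models on the same data and combines their predictions. With hard voting, the ensemble predicts the class chosen by the majority of the models. With soft voting, it averages the class probabilities predicted by the models and outputs the class with the highest average probability.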
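As a minimal sketch of the two voting rules, the snippet below applies them by hand with NumPy. The probability values are made up for illustration and are not taken from the recipe; they are chosen so that hard and soft voting disagree.

```python
import numpy as np

# Hypothetical class probabilities from three classifiers for ONE sample
# (rows: classifiers, columns: classes 0, 1, 2). Illustrative values only.
probas = np.array([
    [0.45, 0.55, 0.00],   # classifier A favours class 1
    [0.40, 0.60, 0.00],   # classifier B favours class 1
    [0.90, 0.10, 0.00],   # classifier C strongly favours class 0
])

# Hard voting: each classifier casts one vote for its most likely class,
# and the class with the most votes wins.
votes = probas.argmax(axis=1)
vote_counts = np.bincount(votes, minlength=probas.shape[1])
hard_winner = int(vote_counts.argmax())

# Soft voting: average the probabilities per class and pick the class
# with the highest mean probability.
mean_proba = probas.mean(axis=0)
soft_winner = int(mean_proba.argmax())

print("hard voting picks class", hard_winner)
print("soft voting picks class", soft_winner)
```

Two of the three classifiers narrowly prefer class 1, so hard voting picks class 1, while the one very confident classifier pulls the averaged probabilities towards class 0 under soft voting.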

So this is the recipe on how we can implement voting ensemble in Python.

Step 1 - Import the library

    from sklearn import model_selection
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.svm import SVC
    from sklearn.ensemble import VotingClassifier
    from sklearn import datasets
    from sklearn.model_selection import train_test_split
    import matplotlib.pyplot as plt
    plt.style.use("ggplot")

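We have imported three classifiers (LogisticRegression, DecisionTreeClassifier and SVC) and VotingClassifier, which combines them into an ensemble, along with helpers for loading the data and for cross-validation.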

Step 2 - Setting up the Data

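We have imported the Wine dataset and stored the data in X and the target in y. We have used train_test_split to hold out 30% of the data for testing, and model_selection.KFold to define 10 cross-validation folds. KFold only uses random_state when shuffle=True, and recent scikit-learn versions raise an error if random_state is set without it, so shuffle=True is passed here.

    seed = 42
    dataset = datasets.load_wine()
    X = dataset.data; y = dataset.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)
    # shuffle=True is required for random_state to take effect
    kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)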
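To check that the split behaves as described, a short sketch like the following prints the dataset shape and the size of each validation fold. Passing random_state to train_test_split is an assumption added here for repeatability; the recipe itself leaves the split unseeded.

```python
from sklearn import datasets
from sklearn.model_selection import KFold, train_test_split

seed = 42
dataset = datasets.load_wine()
X, y = dataset.data, dataset.target
print("data shape (samples, features):", X.shape)

# random_state here is an addition for repeatability, not part of the recipe.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=seed
)

# Each of the 10 folds holds out a different slice of the training data.
kfold = KFold(n_splits=10, shuffle=True, random_state=seed)
fold_sizes = [len(val_idx) for _, val_idx in kfold.split(X_train)]

print("train size:", len(X_train), "test size:", len(X_test))
print("validation fold sizes:", fold_sizes)
```

The fold sizes should add up to the training set size, since every training sample is used for validation exactly once.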

Step 3 - Combining models with Voting Classifier

We have made a list named estimators containing all the models we want to combine, each stored as a (name, model) tuple. We then pass this list to VotingClassifier through its estimators parameter. With the default voting="hard", the ensemble predicts the class chosen by the majority of the models. Finally we have calculated the mean cross-validation score of the ensemble.

    estimators = []
    model1 = LogisticRegression(); estimators.append(("logistic", model1))
    model2 = DecisionTreeClassifier(); estimators.append(("cart", model2))
    model3 = SVC(); estimators.append(("svm", model3))
    ensemble = VotingClassifier(estimators)
    results = model_selection.cross_val_score(ensemble, X_train, y_train, cv=kfold)
    print(results.mean())

So the output comes as a mean accuracy such as the following; the exact value varies between runs because the train/test split is random.

0.9102564102564104
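A single ensemble score is easier to interpret next to the scores of its members. The sketch below, which is an extension and not part of the original recipe, cross-validates each base model and the ensemble on the same folds, then checks the ensemble on the held-out test set. The StandardScaler pipelines, max_iter=1000 and the fixed random_state values are assumptions added to make the run stable and repeatable.

```python
from sklearn import datasets
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

seed = 42
X, y = datasets.load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=seed
)
kfold = KFold(n_splits=10, shuffle=True, random_state=seed)

# Scaling is an added assumption: logistic regression and SVM are
# sensitive to feature scale, while the decision tree is not.
estimators = [
    ("logistic", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("cart", DecisionTreeClassifier(random_state=seed)),
    ("svm", make_pipeline(StandardScaler(), SVC())),
]

# Mean cross-validation accuracy of each base model on the same folds.
scores = {
    name: cross_val_score(model, X_train, y_train, cv=kfold).mean()
    for name, model in estimators
}

# Mean cross-validation accuracy of the hard-voting ensemble.
ensemble = VotingClassifier(estimators)  # default voting="hard"
scores["ensemble"] = cross_val_score(ensemble, X_train, y_train, cv=kfold).mean()

for name, score in scores.items():
    print(f"{name}: {score:.3f}")

# Fit on the full training set and evaluate once on the held-out test set.
ensemble.fit(X_train, y_train)
test_accuracy = ensemble.score(X_test, y_test)
print(f"ensemble test accuracy: {test_accuracy:.3f}")
```

If the ensemble does not beat its best member, the base models are probably making the same mistakes; voting helps most when the members are reasonably accurate and their errors differ.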
