How does the ExtraTrees classifier work in ML in Python

This recipe explains how the ExtraTrees classifier works in ML in Python

Recipe Objective

The Extra Trees classifier uses multiple decision trees to build a classification model. It is also one of the most powerful tools for feature selection, and it is particularly useful in the presence of noisy features.

So this recipe is a short example of how the Extra Trees classifier works. Let's get started.

Step 1 - Import the library

import pandas as pd
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.datasets import load_iris

Let's pause and look at these imports. NumPy and pandas are the usual ones. sklearn.ensemble contains the ExtraTreesClassifier classification model. sklearn.datasets is used here to load a sample classification dataset.

Step 2 - Setup the Data

X, y = load_iris(return_X_y=True)
print(X)
print(y)

Here, we have used the load_iris function to import our dataset as two arrays (X for the features and y for the class labels), which is why return_X_y is set to True.

Now our dataset is ready.
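As a quick check of what load_iris returned, the following minimal sketch inspects the shape of both arrays and the distinct class labels. The use of np.unique is an addition for illustration and is not part of the recipe.

```python
import numpy as np
from sklearn.datasets import load_iris

# Load features (X) and class labels (y) as NumPy arrays
X, y = load_iris(return_X_y=True)

# X holds 150 samples with 4 features each; y holds one label per sample
print(X.shape)
print(y.shape)

# The distinct class labels present in y
classes = np.unique(y)
print(classes)
```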

Step 3 - Building the model

Before we do that, let's look at the important parameters that we need to pass.

1) n_estimators
It decides the number of trees in the forest.

2) criterion
The function to measure the quality of a split. Supported criteria are “gini” for the Gini impurity and “entropy” for the information gain.

3) max_features
It decides the number of features to consider when looking for the best split.
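To get a feel for how these parameters are passed, the sketch below builds one classifier per criterion and scores each with 5-fold cross-validation. The use of cross_val_score, the value n_estimators=50, and the random_state=0 seed are assumptions made for illustration and are not part of the recipe.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

mean_scores = {}
for criterion in ("gini", "entropy"):
    # Assumed settings for illustration: 50 trees, 2 features per split, fixed seed
    model = ExtraTreesClassifier(
        n_estimators=50,
        criterion=criterion,
        max_features=2,
        random_state=0,
    )
    # Mean accuracy over 5 cross-validation folds
    scores = cross_val_score(model, X, y, cv=5)
    mean_scores[criterion] = scores.mean()
    print(criterion, round(mean_scores[criterion], 3))
```

On a small, clean dataset such as iris, the two criteria will likely give similar scores, so the choice matters more on larger or noisier data.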

Now that we understand these parameters, let's create the object.

extra_tree_forest = ExtraTreesClassifier(n_estimators=5, criterion='entropy', max_features=2)
  • As you can see, we have set n_estimators to 5
  • criterion is set to entropy
  • max_features is set to 2

Step 4 - Fit the model and find results

extra_tree_forest.fit(X, y)
feature_importance = extra_tree_forest.feature_importances_
print(feature_importance)

Here we have simply used the fit function to fit our model on X and y. Thereafter, we look at the importance of each feature according to the model we built.
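The raw importance array is easier to read when each value is paired with its feature name. The sketch below does that, assuming we load the full iris object (instead of return_X_y=True) to get feature_names, and assuming a random_state=0 seed so that repeated runs give the same numbers. Neither assumption is part of the recipe.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier

# Load the full dataset object to get access to the feature names
iris = load_iris()
X, y = iris.data, iris.target

# Same settings as the recipe, plus an assumed seed for repeatability
extra_tree_forest = ExtraTreesClassifier(
    n_estimators=5, criterion="entropy", max_features=2, random_state=0
)
extra_tree_forest.fit(X, y)

# One importance value per feature; the values are normalized to sum to 1
importances = extra_tree_forest.feature_importances_

# Pair each feature name with its importance and sort from high to low
ranked = sorted(
    zip(iris.feature_names, importances), key=lambda pair: pair[1], reverse=True
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Without a fixed seed, the importances change from run to run because each tree draws its candidate splits at random, and with only 5 trees that variation can be noticeable.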

Step 5 - Let's look at our results now

Once we run the above code snippet, we will see:

Scroll down the ipython file to have a look at the results.
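Since the objective mentions feature selection, here is one possible way to turn the importances into an actual selection step. This is a minimal sketch that uses scikit-learn's SelectFromModel, which the recipe itself does not use. The values n_estimators=50 and random_state=0 are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel

X, y = load_iris(return_X_y=True)

# Fit the forest first (assumed settings: 50 trees, fixed seed)
forest = ExtraTreesClassifier(n_estimators=50, random_state=0).fit(X, y)

# prefit=True reuses the fitted forest; with no threshold given, features
# whose importance is at least the mean importance are kept
selector = SelectFromModel(forest, prefit=True)
X_selected = selector.transform(X)

print(X.shape, "->", X_selected.shape)

# Boolean mask marking which of the original features were kept
mask = selector.get_support()
print(mask)
```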
