The Extra Trees classifier uses multiple randomized decision trees to build a classification model. It is also one of the most powerful tools for feature selection, and it tends to work well in the presence of noisy features.
So this recipe is a short example of how the Extra Trees classifier works. Let's get started.
import pandas as pd
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.datasets import load_iris
Let's pause and look at these imports. NumPy and pandas are the usual ones. sklearn.ensemble contains the ExtraTreesClassifier classification model. Here sklearn.datasets is used to import a dataset suited to classification.
X, y = load_iris(return_X_y=True)
print(X)
print(y)
Here, we have used the load_iris function to load our dataset as two NumPy arrays (X for the features and y for the target), which is why return_X_y is set to True.
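To double-check what load_iris returned, a quick inspection like the one below can help. This is only a sketch; the variable name iris is our own choice and is not used elsewhere in the recipe.

```python
from sklearn.datasets import load_iris

# Same call as in the recipe: features and target as two arrays
X, y = load_iris(return_X_y=True)
print(type(X).__name__, X.shape)  # 150 samples, 4 features
print(type(y).__name__, y.shape)  # one class label per sample

# Without return_X_y, load_iris returns an object that also carries names
iris = load_iris()
print(iris.feature_names)
print(iris.target_names)
```

Knowing the feature names is handy later, when we want to tell which importance score belongs to which column.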
Now our dataset is ready.
Before we build the model, let's look at the important parameters that we need to pass.
n_estimators: It decides the number of trees in the forest.
criterion: The function to measure the quality of a split. Supported criteria are “gini” for the Gini impurity and “entropy” for the information gain.
max_features: It decides the number of features to consider when looking for the best split.
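To get a feel for the two split criteria, here is a small hand-rolled sketch of both measures for a node with given class counts. The helper names gini_impurity and entropy are our own and are not part of scikit-learn; entropy is computed in bits (log base 2), which is an assumption of this sketch.

```python
import numpy as np

def gini_impurity(class_counts):
    """Gini impurity: 1 - sum(p_k^2) over the class proportions p_k."""
    p = np.asarray(class_counts, dtype=float)
    p = p / p.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(class_counts):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    p = np.asarray(class_counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]  # skip empty classes to avoid log2(0)
    return -np.sum(p * np.log2(p))

# The Iris data has 50 samples of each of its 3 classes,
# so an unsplit node is as impure as a 3-class node can be.
print(gini_impurity([50, 50, 50]))
print(entropy([50, 50, 50]))

# A node holding a single class is pure: both measures are zero.
print(gini_impurity([50, 0, 0]))
```

A split is considered good when the child nodes are purer than the parent, that is, when these numbers drop.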
Now that we understand the parameters, let's create the object.
extra_tree_forest = ExtraTreesClassifier(n_estimators=5, criterion='entropy', max_features=2)
extra_tree_forest.fit(X, y)
feature_importance = extra_tree_forest.feature_importances_
print(feature_importance)
Here we have simply used the fit function to fit our model on X and y. Thereafter, we read the feature_importances_ attribute to understand the importance of each feature according to the model we built.
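The raw array is easier to read once each score is paired with its feature name. The sketch below does this with a pandas Series. Note that random_state=0 is an assumption we add so that repeated runs give the same numbers; the recipe itself does not set it.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Same settings as the recipe; random_state is our addition for repeatability
extra_tree_forest = ExtraTreesClassifier(
    n_estimators=5, criterion="entropy", max_features=2, random_state=0
)
extra_tree_forest.fit(X, y)

# Label each importance with its feature name and sort, most important first
importances = pd.Series(
    extra_tree_forest.feature_importances_, index=iris.feature_names
).sort_values(ascending=False)
print(importances)
```

With only 5 trees the ranking can shift between seeds, so increasing n_estimators is a reasonable thing to try if the scores look unstable.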
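Once we run the above code snippet, we will see an array of four values, one importance score per Iris feature, and the scores sum to 1. Since the trees are built with random splits, the exact numbers can change from run to run.
Scroll down the IPython notebook to have a look at the results.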
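Since the recipe mentions feature selection, here is one possible way to act on the importances, using scikit-learn's SelectFromModel. This is a sketch rather than part of the original recipe: the threshold="mean" setting (keep features whose importance is at least the average) and random_state=0 are our own assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel

X, y = load_iris(return_X_y=True)

# Same forest settings as the recipe; random_state added for repeatability
forest = ExtraTreesClassifier(
    n_estimators=5, criterion="entropy", max_features=2, random_state=0
)

# SelectFromModel fits the forest and keeps features at or above the threshold
selector = SelectFromModel(forest, threshold="mean")
selector.fit(X, y)

X_selected = selector.transform(X)
print(X.shape, "->", X_selected.shape)
print(selector.get_support())  # True for each feature that was kept
```

The reduced X_selected can then be passed to any other model in place of X.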