How to use RandomForest Classifier and Regressor in Python?

This recipe helps you use RandomForest Classifier and Regressor in Python

Recipe Objective

Have you ever tried to use RandomForest models, i.e. the regressor or the classifier? In this recipe we will use both, each on a different dataset.

So this recipe is a short example of how we can use RandomForest Classifier and Regressor in Python.

Step 1 - Import the library

    from sklearn import datasets
    from sklearn import metrics
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split
    import matplotlib.pyplot as plt
    import seaborn as sns
    plt.style.use("ggplot")

Here we have imported various modules such as datasets, RandomForestClassifier, RandomForestRegressor and train_test_split from different libraries. We will see how each is used in the code snippets below.
For now, just have a look at these imports.

Step 2 - Setup the Data for classifier

Here we have used datasets to load the inbuilt wine dataset and we have created objects X and y to store the data and the target value respectively. We then split the data, holding out 25% of it as a test set.

    dataset = datasets.load_wine()
    X = dataset.data
    y = dataset.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
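The split above is random, so the scores change from run to run. As a minimal sketch of a repeatable variant: random_state and stratify are existing train_test_split parameters, but the values used here (42, and stratifying on y) are assumptions for illustration and are not part of the original recipe.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

# Load the bundled wine dataset (178 samples, 13 features, 3 classes)
dataset = datasets.load_wine()
X, y = dataset.data, dataset.target

# random_state fixes the shuffle so the same rows land in each subset every run;
# stratify=y keeps the class proportions similar in the train and test subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```

With test_size=0.25 the test set holds 45 of the 178 samples, which is the total support reported in the classifier output.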

Step 3 - Model and its Score

Here, we are using RandomForestClassifier as a Machine Learning model to fit the data.

    model = RandomForestClassifier()
    model.fit(X_train, y_train)
    print()
    print(model)

Now we have predicted the output by passing X_test and also stored real target in expected_y.

    expected_y = y_test
    predicted_y = model.predict(X_test)

Here we have printed classification report and confusion matrix for the classifier.

    print(metrics.classification_report(expected_y, predicted_y))
    print(metrics.confusion_matrix(expected_y, predicted_y))
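Beyond the report and the confusion matrix, a fitted forest also exposes feature_importances_, which can help when inspecting the model. The following is a sketch only: the seed values and n_estimators=100 are assumptions chosen for repeatability, not settings taken from the recipe.

```python
from sklearn import datasets, metrics
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

dataset = datasets.load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.25, random_state=0
)

# Fixed seed (an assumption) so the fitted forest is the same on every run
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Fraction of test samples that were classified correctly
accuracy = metrics.accuracy_score(y_test, model.predict(X_test))
print("accuracy:", accuracy)

# Impurity-based importances, one per feature; they sum to 1
ranked = sorted(
    zip(dataset.feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked[:5]:
    print(f"{name}: {score:.3f}")
```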

Step 4 - Setup the Data for regressor

Here we have used datasets to load the inbuilt boston dataset and we have created objects X and y to store the data and the target value respectively. Note that load_boston was deprecated in scikit-learn 1.0 and removed in 1.2, so this step needs an older scikit-learn version to run as written.

    dataset = datasets.load_boston()
    X = dataset.data
    y = dataset.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
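On scikit-learn versions where load_boston is not available, one option is to swap in another bundled regression dataset. The sketch below uses load_diabetes; that substitution and the random_state value are assumptions of this example, and the scores it leads to will not match the output shown later in this recipe.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

# load_diabetes ships with scikit-learn: 442 samples, 10 numeric features,
# and a continuous target, so it suits a regressor.
dataset = datasets.load_diabetes()
X, y = dataset.data, dataset.target

# Same 75/25 split as in the recipe, with a fixed seed for repeatability
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

print(X.shape, X_train.shape, X_test.shape)
```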

Step 5 - Model and its Score

Here, we are using RandomForestRegressor as a Machine Learning model to fit the data.

    model_RFR = RandomForestRegressor()
    model_RFR.fit(X_train, y_train)
    print()
    print(model_RFR)

Now we have predicted the output by passing X_test and also stored real target in expected_y.

    expected_y = y_test
    predicted_y = model_RFR.predict(X_test)

Here we have printed r2 score and mean squared log error for the Regressor, and plotted the predicted values against the expected ones.

    print(metrics.r2_score(expected_y, predicted_y))
    print(metrics.mean_squared_log_error(expected_y, predicted_y))
    plt.figure(figsize=(10,10))
    # x and y are passed by keyword, which recent seaborn versions require
    sns.regplot(x=expected_y, y=predicted_y, fit_reg=True, scatter_kws={"s": 100})
    plt.show()
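The same regressor steps can be sketched end to end on a dataset that current scikit-learn releases still bundle. Using load_diabetes, the seed values and n_estimators=100 are assumptions of this example rather than the recipe's settings. Note that mean_squared_log_error only accepts non-negative targets and predictions, which holds here because every diabetes target is positive and a forest predicts averages of training targets.

```python
from sklearn import datasets, metrics
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

dataset = datasets.load_diabetes()
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.25, random_state=0
)

# Fixed seed (an assumption) so repeated runs give the same forest
model_RFR = RandomForestRegressor(n_estimators=100, random_state=0)
model_RFR.fit(X_train, y_train)

expected_y = y_test
predicted_y = model_RFR.predict(X_test)

# r2 is at most 1 (higher is better); MSLE is at least 0 (lower is better)
r2 = metrics.r2_score(expected_y, predicted_y)
msle = metrics.mean_squared_log_error(expected_y, predicted_y)
print("r2:", r2)
print("msle:", msle)
```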

As an output we get:

RandomForestClassifier(bootstrap=True, class_weight=None, criterion="gini",
            max_depth=None, max_features="auto", max_leaf_nodes=None,
            min_impurity_decrease=0.0, min_impurity_split=None,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=None,
            oob_score=False, random_state=None, verbose=0,
            warm_start=False)

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        14
           1       1.00      0.95      0.97        20
           2       0.92      1.00      0.96        11

   micro avg       0.98      0.98      0.98        45
   macro avg       0.97      0.98      0.98        45
weighted avg       0.98      0.98      0.98        45


[[14  0  0]
 [ 0 19  1]
 [ 0  0 11]]

RandomForestRegressor(bootstrap=True, criterion="mse", max_depth=None,
           max_features="auto", max_leaf_nodes=None,
           min_impurity_decrease=0.0, min_impurity_split=None,
           min_samples_leaf=1, min_samples_split=2,
           min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=None,
           oob_score=False, random_state=None, verbose=0, warm_start=False)

0.8609661819993468

0.02760579419028312
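To see how the classification report relates to the confusion matrix printed above, the per-class figures can be recomputed by hand. This is a small sketch using NumPy; the matrix is copied from the output, and it relies on scikit-learn's convention that rows are true classes and columns are predicted classes.

```python
import numpy as np

# Confusion matrix from the output above (rows: true class, columns: predicted)
cm = np.array([
    [14, 0, 0],
    [0, 19, 1],
    [0, 0, 11],
])

true_positives = np.diag(cm)

# Recall: correct predictions divided by the number of true samples per class
recall = true_positives / cm.sum(axis=1)

# Precision: correct predictions divided by the number of predictions per class
precision = true_positives / cm.sum(axis=0)

# Overall accuracy: all correct predictions divided by all samples
accuracy = true_positives.sum() / cm.sum()

print("recall:   ", np.round(recall, 2))
print("precision:", np.round(precision, 2))
print("accuracy: ", round(float(accuracy), 2))
```

The row sums (14, 20, 11) are the support column of the report, and the single misclassified sample explains both the 0.95 recall for class 1 and the 0.92 precision for class 2.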
