How to use LightGBM Classifier and Regressor in Python?
MACHINE LEARNING RECIPES DATA CLEANING PYTHON DATA MUNGING PANDAS CHEATSHEET     ALL TAGS

How to use LightGBM Classifier and Regressor in Python?

How to use LightGBM Classifier and Regressor in Python?

This recipe helps you use LightGBM Classifier and Regressor in Python

Recipe Objective

We have worked on various models and used them to predict the output. Here is one such model that is LightGBM which is an important model and can be used as Regressor and Classifier.

So this is the recipe on how we can use LightGBM Classifier and Regressor.

Step 1 - Import the library

from sklearn import datasets from sklearn import metrics from sklearn.model_selection import train_test_split import matplotlib.pyplot as plt import seaborn as sns plt.style.use('ggplot') import lightgbm as ltb

We have imported all the modules that would be needed like metrics, datasets, ltb, train_test_split etc. We will see the use of each modules step by step further.

Step 2 - Setting up the Data for Classifier

We have imported inbuilt wine dataset from the module datasets and stored the data in X and the target in y. We have also used train_test_split to split the dataset into two parts such that 30% of data is in test and rest in train. dataset = datasets.load_wine() X = dataset.data; y = dataset.target X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)

Step 3 - Using LightGBM Classifier and calculating the scores

We have made an object for the model and fitted the train data. Then we have used the test data to test the model by predicting the output from the model for test data. model = ltb.LGBMClassifier() model.fit(X_train, y_train) print(); print(model) expected_y = y_test predicted_y = model.predict(X_test)

Now We are calcutaing other scores for the model using classification_report and confusion matrix by passing expected and predicted values of target of test set. print(metrics.classification_report(expected_y, predicted_y)) print(metrics.confusion_matrix(expected_y, predicted_y))

Step 4 - Setting up the Data for Regressor

We have imported inbuilt boston dataset from the module datasets and stored the data in X and the target in y. We have also used train_test_split to split the dataset into two parts such that 30% of data is in test and rest in train. dataset = datasets.load_boston() X = dataset.data; y = dataset.target X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)

Step 5 - Using LightGBM Regressor and calculating the scores

We have made an object for the model and fitted the train data. Then we have used the test data to test the model by predicting the output from the model for test data. model = ltb.LGBMRegressor() model.fit(X_train, y_train) print(); print(model) expected_y = y_test predicted_y = model.predict(X_test)

Now We are calcutaing other scores for the model using r_2 score and mean_squared_log_error by passing expected and predicted values of target of test set. print(metrics.r2_score(expected_y, predicted_y)) print(metrics.mean_squared_log_error(expected_y, predicted_y))

Step 6 - Ploting the model

We are ploting the regressor model: plt.figure(figsize=(10,10)) sns.regplot(expected_y, predicted_y, fit_reg=True, scatter_kws={"s": 100}) So the final output comes as:

LGBMClassifier(boosting_type='gbdt', class_weight=None, colsample_bytree=1.0,
        importance_type='split', learning_rate=0.1, max_depth=-1,
        min_child_samples=20, min_child_weight=0.001, min_split_gain=0.0,
        n_estimators=100, n_jobs=-1, num_leaves=31, objective=None,
        random_state=None, reg_alpha=0.0, reg_lambda=0.0, silent=True,
        subsample=1.0, subsample_for_bin=200000, subsample_freq=0)

              precision    recall  f1-score   support

           0       0.94      1.00      0.97        15
           1       1.00      0.95      0.98        22
           2       1.00      1.00      1.00        17

   micro avg       0.98      0.98      0.98        54
   macro avg       0.98      0.98      0.98        54
weighted avg       0.98      0.98      0.98        54


[[15  0  0]
 [ 1 21  0]
 [ 0  0 17]]

LGBMRegressor(boosting_type='gbdt', class_weight=None, colsample_bytree=1.0,
       importance_type='split', learning_rate=0.1, max_depth=-1,
       min_child_samples=20, min_child_weight=0.001, min_split_gain=0.0,
       n_estimators=100, n_jobs=-1, num_leaves=31, objective=None,
       random_state=None, reg_alpha=0.0, reg_lambda=0.0, silent=True,
       subsample=1.0, subsample_for_bin=200000, subsample_freq=0)

0.8079584681584322

0.02837589562421279

Download Materials

Relevant Projects

Data Science Project in Python on BigMart Sales Prediction
The goal of this data science project is to build a predictive model and find out the sales of each product at a given Big Mart store.

Data Science Project on Wine Quality Prediction in R
In this R data science project, we will explore wine dataset to assess red wine quality. The objective of this data science project is to explore which chemical properties will influence the quality of red wines.

Customer Churn Prediction Analysis using Ensemble Techniques
In this machine learning churn project, we implement a churn prediction model in python using ensemble techniques.

Expedia Hotel Recommendations Data Science Project
In this data science project, you will contextualize customer data and predict the likelihood a customer will stay at 100 different hotel groups.

Predict Credit Default | Give Me Some Credit Kaggle
In this data science project, you will predict borrowers chance of defaulting on credit loans by building a credit score prediction model.

Census Income Data Set Project - Predict Adult Census Income
Use the Adult Income dataset to predict whether income exceeds 50K yr based on census data.

Demand prediction of driver availability using multistep time series analysis
In this supervised learning machine learning project, you will predict the availability of a driver in a specific area by using multi step time series analysis.

Avocado Machine Learning Project Python for Price Prediction
In this ML Project, you will use the Avocado dataset to build a machine learning model to predict the average price of avocado which is continuous in nature based on region and varieties of avocado.

Natural language processing Chatbot application using NLTK for text classification
In this NLP AI application, we build the core conversational engine for a chatbot. We use the popular NLTK text classification library to achieve this.

Data Science Project - Instacart Market Basket Analysis
Data Science Project - Build a recommendation engine which will predict the products to be purchased by an Instacart consumer again.