How to parallelise execution of XGBoost and cross validation in Python?

This recipe helps you parallelise the execution of XGBoost and cross validation in Python.

Recipe Objective

Have you ever tried to parallelise a function and measure the running time of a model?

This recipe is a short example of how we can parallelise the execution of XGBoost and cross validation in Python.

Step 1 - Import the library

import time
from sklearn import datasets
from sklearn.model_selection import train_test_split, cross_val_score
from xgboost import XGBClassifier

Here we have imported the time and datasets modules, the train_test_split and cross_val_score functions, and the XGBClassifier class from different libraries. We will see how each one is used in the code snippets that follow.
For now, just take a look at these imports.

Step 2 - Setup the Data

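Here we have used datasets to load the inbuilt wine dataset, and we have created objects X and y to store the data and the target values respectively. We also split the data into training and test sets, holding out 25% for testing. Note that the timing steps that follow run cross validation on the full X and y.

dataset = datasets.load_wine()
X = dataset.data; y = dataset.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)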
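As a quick sanity check on the data setup, the sketch below loads the same dataset and prints its dimensions. The random_state=42 and stratify=y arguments are additions that are not part of the original recipe; they are assumed here only to make the split reproducible and to keep the class proportions similar in both parts.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

# Load the built-in wine dataset: 178 samples, 13 features, 3 classes.
dataset = datasets.load_wine()
X, y = dataset.data, dataset.target
print(X.shape)  # (178, 13)
print(sorted(set(y.tolist())))  # class labels

# random_state and stratify are assumptions added for reproducibility;
# the original recipe calls train_test_split without them.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)
print(len(X_train), len(X_test))
```

Without a fixed random_state, every run produces a different split, which makes later comparisons harder to repeat.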

Step 3 - Single Thread XGBoost and Parallel Thread CV

Here we use XGBClassifier as the machine learning model and cross_val_score to evaluate it with 10-fold cross validation. We pass nthread=1 to the model so that XGBoost trains on a single thread, and n_jobs=-1 to cross_val_score so that the folds are evaluated in parallel on all available cores. We use the time library to measure the elapsed time.

start = time.time()
model = XGBClassifier(nthread=1)
results = cross_val_score(model, X, y, cv=10, scoring="neg_log_loss", n_jobs=-1)
elapsed = time.time() - start
print("Single Thread XGBoost, Parallel Thread CV: %f" % (elapsed))
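The start/elapsed pattern used for timing can be wrapped in a small helper. The sketch below is one possible way to do it; the function name timed is hypothetical, and it uses time.perf_counter() rather than time.time(), on the assumption that a monotonic, high-resolution clock is preferable for measuring short intervals.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    # perf_counter is a monotonic clock intended for measuring intervals.
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Usage with a cheap stand-in workload instead of a model fit.
result, elapsed = timed(sum, range(1_000_000))
print("sum=%d, elapsed=%f s" % (result, elapsed))
```

The same helper could wrap a cross_val_score call, for example timed(cross_val_score, model, X, y, cv=10).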

Step 4 - Parallel Thread XGBoost and Single Thread CV

Here we again use XGBClassifier with cross_val_score, but we reverse the settings. We pass nthread=-1 to the model so that XGBoost trains on all available threads, and n_jobs=1 to cross_val_score so that the folds are evaluated one after another. We use the time library to measure the elapsed time.

start = time.time()
model = XGBClassifier(nthread=-1)
results = cross_val_score(model, X, y, cv=10, scoring="neg_log_loss", n_jobs=1)
elapsed = time.time() - start
print("Parallel Thread XGBoost, Single Thread CV: %f" % (elapsed))

Step 5 - Parallel Thread XGBoost and Parallel Thread CV

Here we enable parallelism in both places. We pass nthread=-1 to the model so that XGBoost trains on all available threads, and n_jobs=-1 to cross_val_score so that the folds are also evaluated in parallel. We use the time library to measure the elapsed time.

start = time.time()
model = XGBClassifier(nthread=-1)
results = cross_val_score(model, X, y, cv=10, scoring="neg_log_loss", n_jobs=-1)
elapsed = time.time() - start
print("Parallel Thread XGBoost and CV: %f" % (elapsed))

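As an output we get the elapsed time in seconds for each configuration (the exact values depend on the machine and vary from run to run):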

Single Thread XGBoost, Parallel Thread CV: 3.380478
Parallel Thread XGBoost, Single Thread CV: 2.431405
Parallel Thread XGBoost and CV: 0.197474
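One way to reason about the three configurations is to count how many threads each one may ask for. The sketch below is a rough upper bound only: requested_threads is a hypothetical helper, and it assumes that a value of -1 resolves to all logical cores in both XGBoost and cross_val_score. The actual number of running threads depends on how joblib and XGBoost handle threading on a given system.

```python
import os


def requested_threads(cv_jobs, model_threads, n_folds=10, cores=None):
    """Estimate an upper bound on threads requested by one configuration."""
    cores = cores or os.cpu_count() or 1
    # Assumption: -1 means "use every logical core".
    workers = cores if cv_jobs == -1 else cv_jobs
    # There is never more parallel work than there are folds.
    workers = min(workers, n_folds)
    threads_per_model = cores if model_threads == -1 else model_threads
    return workers * threads_per_model


# Example on an assumed 4-core machine.
for label, cv_jobs, model_threads in [
    ("Single Thread XGBoost, Parallel Thread CV", -1, 1),
    ("Parallel Thread XGBoost, Single Thread CV", 1, -1),
    ("Parallel Thread XGBoost and CV", -1, -1),
]:
    print(label, requested_threads(cv_jobs, model_threads, cores=4))
```

When the estimate is much larger than the number of cores, the threads may compete for the same hardware, so enabling parallelism in both places does not always give the shortest time.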
