How to parallelise execution of XGBoost and cross validation in Python?

This recipe helps you parallelise execution of XGBoost and cross validation in Python

Recipe Objective

Have you ever tried to parallelise a function and measure the running time of a model?

This recipe is a short example of how we can parallelise the execution of XGBoost and cross validation in Python.


Step 1 - Import the library

import time
from sklearn import datasets
from sklearn.model_selection import train_test_split, cross_val_score
from xgboost import XGBClassifier

Here we have imported the time module, datasets, train_test_split and cross_val_score from scikit-learn, and XGBClassifier from xgboost. We will see how each of these is used in the code snippets that follow.
For now, just take note of these imports.

Step 2 - Setup the Data

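Here we have used datasets to load the built-in wine dataset, and we have created objects X and y to store the data and the target values respectively. We have also split the data into training and test sets, holding out 25% for testing. The timing steps below run cross validation on the full X and y.
dataset = datasets.load_wine()
X = dataset.data
y = dataset.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)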

Step 3 - Single Thread XGBoost and Parallel Thread CV

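Here, we are using XGBClassifier as the machine learning model and cross_val_score to evaluate it with 10-fold cross validation. We have passed nthread=1 to the model, so each model trains on a single thread, and n_jobs=-1 to cross_val_score, so the folds are evaluated in parallel on all available cores. (nthread is XGBoost's older name for this setting; recent releases of its scikit-learn wrapper document it as n_jobs.) We are using the time library to measure the elapsed time.
start = time.time()
model = XGBClassifier(nthread=1)
results = cross_val_score(model, X, y, cv=10, scoring="neg_log_loss", n_jobs=-1)
elapsed = time.time() - start
print("Single Thread XGBoost, Parallel Thread CV: %f" % (elapsed))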
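The start/elapsed pattern used in this step can be wrapped in a small helper so it does not have to be repeated by hand. The following is only a minimal sketch: `timed` is a hypothetical context manager that is not part of the recipe, the workload is a stand-in for the cross_val_score call, and `time.perf_counter` is used as an assumption because it is a monotonic clock intended for measuring intervals.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Time the enclosed block and store the elapsed seconds in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        results[label] = elapsed
        print("%s: %f" % (label, elapsed))


timings = {}
with timed("Stand-in workload", timings):
    # Placeholder for the cross_val_score call from the recipe.
    total = sum(i * i for i in range(100_000))
```

In the recipe itself, the body of the `with` block would hold the model construction and the cross_val_score call, and the label would be the message that is printed in each step.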

Step 4 - Parallel Thread XGBoost and Single Thread CV

Here, we are again using XGBClassifier as the machine learning model and cross_val_score to evaluate it with 10-fold cross validation. This time we have passed nthread=-1 to the model, so each model trains on all available threads, and n_jobs=1 to cross_val_score, so the folds are evaluated one after another. We are using the time library to measure the elapsed time.
start = time.time()
model = XGBClassifier(nthread=-1)
results = cross_val_score(model, X, y, cv=10, scoring="neg_log_loss", n_jobs=1)
elapsed = time.time() - start
print("Parallel Thread XGBoost, Single Thread CV: %f" % (elapsed))

Step 5 - Parallel Thread XGBoost and CV

Here, we are using XGBClassifier as the machine learning model and cross_val_score to evaluate it with 10-fold cross validation. We have passed nthread=-1 to the model and n_jobs=-1 to cross_val_score, so both the model training and the cross validation folds are allowed to run in parallel. We are using the time library to measure the elapsed time.
start = time.time()
model = XGBClassifier(nthread=-1)
results = cross_val_score(model, X, y, cv=10, scoring="neg_log_loss", n_jobs=-1)
elapsed = time.time() - start
print("Parallel Thread XGBoost and CV: %f" % (elapsed))
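When both levels are parallelised, the model and the cross validation loop each ask for every core, which may lead to more threads than cores on some machines. One possible alternative is to divide the cores explicitly between the two levels. The sketch below assumes a hypothetical helper, `split_cores`, that is not part of the recipe or of either library; it only computes the two numbers and does not train anything.

```python
import os


def split_cores(total_cores, n_folds):
    """Return (cv_jobs, model_threads) with cv_jobs * model_threads <= total_cores.

    Hypothetical helper: gives the CV loop as many workers as there are
    folds (capped at the core count) and divides the cores evenly among
    those workers for model training.
    """
    total_cores = max(1, total_cores)
    cv_jobs = max(1, min(n_folds, total_cores))
    model_threads = max(1, total_cores // cv_jobs)
    return cv_jobs, model_threads


cores = os.cpu_count() or 1  # os.cpu_count() can return None
cv_jobs, model_threads = split_cores(cores, n_folds=10)
print("CV workers: %d, threads per model: %d" % (cv_jobs, model_threads))
```

The two values could then be passed as n_jobs=cv_jobs to cross_val_score and nthread=model_threads to XGBClassifier. Whether this is faster than the -1/-1 setting used in this step is something to measure on your own machine.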

As output we get (exact timings will vary with the machine and the number of cores):

Single Thread XGBoost, Parallel Thread CV: 3.380478
Parallel Thread XGBoost, Single Thread CV: 2.431405
Parallel Thread XGBoost and CV: 0.197474
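Each figure above comes from a single run, and a first parallel run may also pay one-off costs such as starting worker processes, so individual numbers can be noisy. A minimal sketch of repeating a measurement and reporting the best and median time follows. `time_repeated` and `stand_in_workload` are hypothetical names introduced for illustration; the workload is a placeholder for one cross_val_score call.

```python
import statistics
import time


def time_repeated(func, repeats=5):
    """Call func() `repeats` times and return the list of elapsed seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def stand_in_workload():
    # Placeholder for one cross_val_score call from the steps above.
    return sum(i * i for i in range(50_000))


durations = time_repeated(stand_in_workload, repeats=5)
print("best: %f, median: %f" % (min(durations), statistics.median(durations)))
```

Comparing the median of several runs for each of the three configurations gives a steadier picture than comparing three single measurements.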
