How to evaluate an XGBoost model with learning curves (example 2)

This recipe helps you evaluate an XGBoost model with learning curves (example 2).

Recipe Objective

When training a model, we often need to know how its performance changes as training progresses. Training on a very large dataset can take a long time, so it helps to see the model's score at intermediate points instead of only at the end. A learning curve shows exactly this. Here we evaluate XGBoost with learning curves by recording the log loss and classification error on the training and test sets after each boosting round.

This recipe is a short example of how to visualise an XGBoost model with learning curves.

Step 1 - Import the library

from numpy import loadtxt
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from matplotlib import pyplot
import matplotlib.pyplot as plt
plt.style.use("ggplot")

Here we have imported loadtxt from NumPy, XGBClassifier from XGBoost, train_test_split and accuracy_score from scikit-learn, and pyplot from Matplotlib. We will see how each is used in the code snippets that follow. For now, just have a look at these imports.

Step 2 - Setup the Data

Here we use loadtxt to load the Pima Indians diabetes dataset from a CSV file. We create the objects X and Y to store the features (the first eight columns) and the target (the ninth column) respectively, then hold out 33% of the rows as a test set.

dataset = loadtxt("pima.indians.diabetes.data.csv", delimiter=",")
X = dataset[:,0:8]
Y = dataset[:,8]
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=7)
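To make the split less of a black box, here is a minimal pure-Python sketch of a seeded hold-out split. It is not scikit-learn's implementation: the helper name holdout_split, the rounding-up of the test share, and the toy rows are all assumptions made for illustration.

```python
import math
import random


def holdout_split(rows, test_size=0.33, seed=7):
    """Shuffle row indices with a fixed seed and split them into train/test.

    Illustrative helper (hypothetical name), not scikit-learn's algorithm.
    """
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)        # same seed -> same shuffle
    n_test = math.ceil(test_size * len(rows))   # assumption: round test share up
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [rows[i] for i in train_idx]
    test = [rows[i] for i in test_idx]
    return train, test


# Toy data: 100 made-up rows standing in for the CSV rows.
toy_rows = [[i, i % 2] for i in range(100)]
train_rows, test_rows = holdout_split(toy_rows)
print(len(train_rows), len(test_rows))
```

Because the seed is fixed, calling the helper again returns the same partition, which is the role random_state=7 plays in the recipe.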

Step 3 - Model and Result

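Here we train XGBClassifier(), passing both the training and test sets as eval_set so that the classification error and log loss are recorded after every boosting round. We then compute the accuracy on the test set and read the number of boosting rounds (epochs) from the recorded results.

model = XGBClassifier()
# evaluate on both sets so we get one curve for train and one for test
eval_set = [(X_train, y_train), (X_test, y_test)]
# Note: recent XGBoost releases expect eval_metric on the constructor,
# e.g. XGBClassifier(eval_metric=["error", "logloss"]), instead of in fit()
model.fit(X_train, y_train, eval_metric=["error", "logloss"], eval_set=eval_set, verbose=False)
# accuracy on the held-out test set
y_pred = model.predict(X_test)
predictions = [round(value) for value in y_pred]
accuracy = accuracy_score(y_test, predictions)
print("Accuracy: %.2f%%" % (accuracy * 100.0))
# per-round metrics: validation_0 is the training set, validation_1 the test set
results = model.evals_result()
epochs = len(results["validation_0"]["error"])
x_axis = range(0, epochs)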
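The object returned by evals_result() is a nested dictionary: one entry per evaluation set (validation_0, validation_1, in the order given in eval_set), each mapping a metric name to a list with one value per boosting round. The sketch below uses a small hand-written dictionary of that shape (the numbers are invented, not model output) and a hypothetical helper best_round to find the round where a metric is lowest.

```python
def best_round(results, eval_name, metric):
    """Return (round_index, value) where the metric is lowest for one eval set."""
    history = results[eval_name][metric]
    idx = min(range(len(history)), key=history.__getitem__)
    return idx, history[idx]


# Invented numbers shaped like model.evals_result(); not real training output.
fake_results = {
    "validation_0": {
        "error": [0.20, 0.15, 0.10, 0.08],
        "logloss": [0.60, 0.50, 0.42, 0.37],
    },
    "validation_1": {
        "error": [0.26, 0.24, 0.25, 0.27],
        "logloss": [0.62, 0.55, 0.54, 0.56],
    },
}

round_idx, value = best_round(fake_results, "validation_1", "logloss")
print(f"lowest test logloss {value} at round {round_idx}")
```

With the real results from this step, the same call points at the boosting round after which the test curve stops improving.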

Step 4 - Plotting the Log loss and classification error

Finally, it's time to plot the log loss and classification error. We use matplotlib to draw one line for the training set and one for the test set in each plot.

# plot log loss
fig, ax = pyplot.subplots(figsize=(12,12))
ax.plot(x_axis, results["validation_0"]["logloss"], label="Train")
ax.plot(x_axis, results["validation_1"]["logloss"], label="Test")
ax.legend()
pyplot.ylabel("Log Loss")
pyplot.title("XGBoost Log Loss")
pyplot.show()

# plot classification error
fig, ax = pyplot.subplots(figsize=(12,12))
ax.plot(x_axis, results["validation_0"]["error"], label="Train")
ax.plot(x_axis, results["validation_1"]["error"], label="Test")
ax.legend()
pyplot.ylabel("Classification Error")
pyplot.title("XGBoost Classification Error")
pyplot.show()

As output we get the accuracy printed in Step 3 and two plots: "XGBoost Log Loss" and "XGBoost Classification Error", each with a Train and a Test curve over the boosting rounds.
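To make the y-axes of the two plots concrete, here is a small sketch of what the two metrics measure for binary labels. The function names, the 0.5 threshold default and the toy probabilities are assumptions for illustration; XGBoost computes these values internally during training.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the true labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def classification_error(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction differs from the label."""
    wrong = sum(int(p > threshold) != y for y, p in zip(y_true, y_prob))
    return wrong / len(y_true)


# Toy labels and made-up predicted probabilities of the positive class.
labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.6]
print(log_loss(labels, probs), classification_error(labels, probs))
```

Log loss penalises confident wrong probabilities, while classification error only counts whether the thresholded prediction is right, which is why the two curves can move differently.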

