How to evaluate an XGBoost model with learning curves (example 2)

This recipe helps you evaluate an XGBoost model with learning curves.


Recipe Objective

While training a model, we often want to know how its performance changes as training progresses. Training on a very large dataset can take a long time, so it helps to know the model's score at intermediate points rather than only at the end. A learning curve shows exactly this. Here we evaluate XGBoost with learning curves by recording evaluation metrics on the training and test sets after each boosting round.

This recipe is a short example of how to visualise the training of an XGBoost model with learning curves.

Step 1 - Import the library

from numpy import loadtxt
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from matplotlib import pyplot
import matplotlib.pyplot as plt
plt.style.use("ggplot")

Here we have imported loadtxt from NumPy, XGBClassifier from xgboost, train_test_split and accuracy_score from scikit-learn, and pyplot from matplotlib. We will see how each of these is used in the code snippets that follow.

Step 2 - Setup the Data

Here we use loadtxt to load the Pima Indians diabetes dataset from a CSV file. The first eight columns are stored in X as the features and the ninth column is stored in Y as the target. We then split the data into training and test sets, holding out 33% of the rows for testing.

dataset = loadtxt("pima.indians.diabetes.data.csv", delimiter=",")
X = dataset[:,0:8]
Y = dataset[:,8]
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=7)
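To make the column slicing concrete, here is a minimal sketch that runs loadtxt on a small in-memory CSV instead of the real file. The three rows are invented purely for illustration (they are not taken from the dataset), and the names sample_csv and sample are hypothetical.

```python
from io import StringIO
from numpy import loadtxt

# Three invented rows in the same layout as the CSV: 8 features, then 1 label.
# These values are illustrative only and do not come from the real dataset.
sample_csv = StringIO(
    "2,120,70,30,0,30.5,0.50,40,1\n"
    "0,95,60,20,0,22.0,0.25,25,0\n"
    "5,160,80,35,0,35.5,0.75,55,1\n"
)

sample = loadtxt(sample_csv, delimiter=",")

# Columns 0..7 are the features, column 8 is the target.
X = sample[:, 0:8]
Y = sample[:, 8]

print(X.shape)  # one row per sample, eight feature columns
print(Y)        # one label per sample
```

The slice 0:8 excludes index 8, so the label column never leaks into the features.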

Step 3 - Model and Result

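Here we train XGBClassifier() while tracking the classification error and log loss on both the training and test sets, then calculate the test accuracy and the number of boosting rounds (epochs) that were recorded.

# eval_metric is set on the constructor; recent XGBoost releases no longer accept it in fit()
model = XGBClassifier(eval_metric=["error", "logloss"])
eval_set = [(X_train, y_train), (X_test, y_test)]
model.fit(X_train, y_train, eval_set=eval_set, verbose=False)

# make predictions on the test set and measure accuracy
y_pred = model.predict(X_test)
predictions = [round(value) for value in y_pred]
accuracy = accuracy_score(y_test, predictions)
print("Accuracy: %.2f%%" % (accuracy * 100.0))

# retrieve the per-round metrics recorded during training
results = model.evals_result()
epochs = len(results["validation_0"]["error"])
x_axis = range(0, epochs)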
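The two tracked metrics can be reproduced by hand, which may help when reading the curves. The sketch below is not XGBoost's implementation; it is a plain-Python illustration of binary classification error (predictions above 0.5 count as positive) and binary log loss. The function names and the sample labels and probabilities are hypothetical.

```python
import math

def classification_error(y_true, y_prob, threshold=0.5):
    """Fraction of rows whose thresholded prediction differs from the label."""
    wrong = sum(1 for t, p in zip(y_true, y_prob) if (p > threshold) != (t == 1))
    return wrong / len(y_true)

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the true labels under the predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# Invented labels and predicted probabilities, for illustration only.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.6]

print(classification_error(y_true, y_prob))  # 0.5: the last two rows are misclassified
print(log_loss(y_true, y_prob))
```

Error only counts which side of the threshold a prediction falls on, while log loss also penalises confident wrong probabilities, so the two curves can move differently across boosting rounds.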

Step 4 - Plotting the Log loss and classification error

Finally, it's time to plot the log loss and classification error for each boosting round. We use matplotlib to draw one line for the training set (validation_0) and one for the test set (validation_1).

# plot log loss
fig, ax = pyplot.subplots(figsize=(12,12))
ax.plot(x_axis, results["validation_0"]["logloss"], label="Train")
ax.plot(x_axis, results["validation_1"]["logloss"], label="Test")
ax.legend()
pyplot.ylabel("Log Loss")
pyplot.title("XGBoost Log Loss")
pyplot.show()

# plot classification error
fig, ax = pyplot.subplots(figsize=(12,12))
ax.plot(x_axis, results["validation_0"]["error"], label="Train")
ax.plot(x_axis, results["validation_1"]["error"], label="Test")
ax.legend()
pyplot.ylabel("Classification Error")
pyplot.title("XGBoost Classification Error")
pyplot.show()

Running the code prints the test accuracy and displays the two plots.
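One common way to read such curves is to look for the round where the test metric stops improving while the training metric keeps falling, which suggests overfitting. The sketch below is a hypothetical helper, best_round, that finds that round from a dictionary shaped like the one returned by evals_result(). The numbers are invented for illustration and are not results from this recipe.

```python
def best_round(results, metric="logloss", eval_name="validation_1"):
    """Return (index, value) of the boosting round with the lowest metric value."""
    values = results[eval_name][metric]
    best = min(range(len(values)), key=values.__getitem__)
    return best, values[best]

# Invented per-round values in the same nested layout as evals_result().
results = {
    "validation_0": {"logloss": [0.60, 0.50, 0.42, 0.36, 0.31]},  # train keeps improving
    "validation_1": {"logloss": [0.62, 0.55, 0.52, 0.53, 0.56]},  # test bottoms out, then rises
}

round_index, value = best_round(results)
print(round_index, value)  # 2 0.52
```

In this made-up example the test log loss is lowest at round index 2, so additional rounds only fit the training data more closely.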

