How to check models AUC score using cross validation in Python?

This recipe helps you check models AUC score using cross validation in Python

Recipe Objective

While working on a classification model, we feel a need of a metric which can show us how our model is performing. A metric which can also give a graphical representation of the performance will be very helpful.

Area under ROC curve can efficiently give us the score that how our model is performing in classifing the labels. We can also plot graph between False Positive Rate and True Positive Rate with this ROC(Receiving Operating Characteristic) curve. The area under the ROC curve is a metric. Greater the area means better the performance.

This data science python source code does the following:
1. Classification metrics used for validation of model.
2. Performs train_test_split to seperate training and testing dataset
3. Implements CrossValidation on models and calculating the final result using "AUC_ROC method" method.

So this recipe is a short example of how can check model's AUC score using cross validation in Python

Get Closer To Your Dream of Becoming a Data Scientist with 70+ Solved End-to-End ML Projects

Step 1 - Import the library - GridSearchCv

from sklearn.model_selection import cross_val_score from sklearn.tree import DecisionTreeClassifier from sklearn import datasets

Here we have imported various modules like: datasets from which we will get the dataset, DecisionTreeClassifier and Cross_val_score.

Step 2 - Setup the Data

Here we have used datasets to load the inbuilt breast cancer dataset and we have created objects X and y to store the data and the target value respectively. cancer = datasets.load_breast_cancer() X = cancer.data print(X) y = cancer.target print(y)

Explore More Data Science and Machine Learning Projects for Practice. Fast-Track Your Career Transition with ProjectPro

Step 3 - Model and the cross Validation Score

We have used DecisionTreeClassifier as a model and then calculated cross validation score. While calculating Cross validation Score we have set the scoring parameter as roc_auc i.e. area under ROC and cv as 7. Then we have calculated the mean and standard deviation of the 7 scores we get. dtree = DecisionTreeClassifier() print(cross_val_score(dtree, X, y, scoring="roc_auc", cv = 7)) mean_score = cross_val_score(dtree, X, y, scoring="roc_auc", cv = 7).mean() std_score = cross_val_score(dtree, X, y, scoring="roc_auc", cv = 7).std() print(mean_score) print(std_score) As an output we get:

[[1.799e+01 1.038e+01 1.228e+02 ... 2.654e-01 4.601e-01 1.189e-01]
 [2.057e+01 1.777e+01 1.329e+02 ... 1.860e-01 2.750e-01 8.902e-02]
 [1.969e+01 2.125e+01 1.300e+02 ... 2.430e-01 3.613e-01 8.758e-02]
 ...
 [1.660e+01 2.808e+01 1.083e+02 ... 1.418e-01 2.218e-01 7.820e-02]
 [2.060e+01 2.933e+01 1.401e+02 ... 2.650e-01 4.087e-01 1.240e-01]
 [7.760e+00 2.454e+01 4.792e+01 ... 0.000e+00 2.871e-01 7.039e-02]]

[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 1 0 0 0 0 0 0 0 0 1 0 1 1 1 1 1 0 0 1 0 0 1 1 1 1 0 1 0 0 1 1 1 1 0 1 0 0
 1 0 1 0 0 1 1 1 0 0 1 0 0 0 1 1 1 0 1 1 0 0 1 1 1 0 0 1 1 1 1 0 1 1 0 1 1
 1 1 1 1 1 1 0 0 0 1 0 0 1 1 1 0 0 1 0 1 0 0 1 0 0 1 1 0 1 1 0 1 1 1 1 0 1
 1 1 1 1 1 1 1 1 0 1 1 1 1 0 0 1 0 1 1 0 0 1 1 0 0 1 1 1 1 0 1 1 0 0 0 1 0
 1 0 1 1 1 0 1 1 0 0 1 0 0 0 0 1 0 0 0 1 0 1 0 1 1 0 1 0 0 0 0 1 1 0 0 1 1
 1 0 1 1 1 1 1 0 0 1 1 0 1 1 0 0 1 0 1 1 1 1 0 1 1 1 1 1 0 1 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 1 1 1 1 1 1 0 1 0 1 1 0 1 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1
 1 0 1 1 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 0 1 0 1 1 1 1 0 0 0 1 1
 1 1 0 1 0 1 0 1 1 1 0 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 1 0 0
 0 1 0 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 0 1 1 0 0 1 1 1 1 1 1 0 1 1 1 1 1 1
 1 0 1 1 1 1 1 0 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 0 1 0 1 1 1 1 1 0 1 1
 0 1 0 1 1 0 1 0 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 0 1
 1 1 1 1 1 1 0 1 0 1 1 0 1 1 1 1 1 0 0 1 0 1 0 1 1 1 1 1 0 1 1 0 1 0 1 0 0
 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 0 0 0 0 0 0 1]

[0.93833017 0.88646426 0.91078431 0.96372549 0.9372549  0.9372549
 0.90784314]

0.9346977500677692

0.024912511477023204

Download Materials

What Users are saying..

profile image

Ameeruddin Mohammed

ETL (Abintio) developer at IBM
linkedin profile url

I come from a background in Marketing and Analytics and when I developed an interest in Machine Learning algorithms, I did multiple in-class courses from reputed institutions though I got good... Read More

Relevant Projects

Hands-On Approach to Master PyTorch Tensors with Examples
In this deep learning project, you will learn how to perform various operations on the building block of PyTorch : Tensors.

Recommender System Machine Learning Project for Beginners-4
Collaborative Filtering Recommender System Project - Comparison of different model based and memory based methods to build recommendation system using collaborative filtering.

MLOps Project to Deploy Resume Parser Model on Paperspace
In this MLOps project, you will learn how to deploy a Resume Parser Streamlit Application on Paperspace Private Cloud.

Time Series Analysis with Facebook Prophet Python and Cesium
Time Series Analysis Project - Use the Facebook Prophet and Cesium Open Source Library for Time Series Forecasting in Python

Build a Text Generator Model using Amazon SageMaker
In this Deep Learning Project, you will train a Text Generator Model on Amazon Reviews Dataset using LSTM Algorithm in PyTorch and deploy it on Amazon SageMaker.

Build CNN Image Classification Models for Real Time Prediction
Image Classification Project to build a CNN model in Python that can classify images into social security cards, driving licenses, and other key identity information.

Learn How to Build a Logistic Regression Model in PyTorch
In this Machine Learning Project, you will learn how to build a simple logistic regression model in PyTorch for customer churn prediction.

Skip Gram Model Python Implementation for Word Embeddings
Skip-Gram Model word2vec Example -Learn how to implement the skip gram algorithm in NLP for word embeddings on a set of documents.

Tensorflow Transfer Learning Model for Image Classification
Image Classification Project - Build an Image Classification Model on a Dataset of T-Shirt Images for Binary Classification

Detectron2 Object Detection and Segmentation Example Python
Object Detection using Detectron2 - Build a Dectectron2 model to detect the zones and inhibitions in antibiogram images.