How to use k fold cross validation in sklearn

This recipe helps you use k fold cross validation in sklearn. The k fold cross validation procedure is a method for estimating the performance of a ML algorithm on a dataset. The k fold cross validation procedure divides a limited dataset into k non overlapping folds.
Last Updated: 20 Jul 2022

Get access to Data Science projects View all Data Science projects

MACHINE LEARNING PROJECTS IN PYTHON DATA CLEANING PYTHON DATA MUNGING MACHINE LEARNING RECIPES PANDAS CHEATSHEET ALL TAGS

Recipe Objective - How to use k-fold cross-validation in sklearn?

K-Folds cross-validator.

The k-fold cross-validation procedure is a method for estimating the performance of an ML algorithm on a dataset. The k-fold cross-validation procedure divides a limited dataset into k non-overlapping folds. A total of k models are fit and evaluated on the k hold-out test sets and the mean performance is reported.

Each fold is then used once as validation while the k - 1 remaining folds form the training set.

Recipe Objective - How to use k-fold cross-validation in sklearn?

Links for the more related projects:-

https://www.projectpro.io/projects/data-science-projects/deep-learning-projects
https://www.projectpro.io/projects/data-science-projects/neural-network-projects

Example:-

Step:1 Import Libraries:-

from sklearn.model_selection import KFold import numpy as np # create the range 1 to 25 rn = range(1,26)

Step:2 Creating Folds:-

# to demonstrate how the data are split, we will create 3 and 5 folds. # it returns an location (index) of the train and test samples. kf5 = KFold(n_splits=5, shuffle=False) kf3 = KFold(n_splits=3, shuffle=False) # the Kfold function retunrs the indices of the data. Our range goes from 1-25 so the index is 0-24 for train_index, test_index in kf3.split(rn): print(train_index, test_index)

[ 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24] [0 1 2 3 4 5 6 7 8]
[ 0  1  2  3  4  5  6  7  8 17 18 19 20 21 22 23 24] [ 9 10 11 12 13 14 15 16]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16] [17 18 19 20 21 22 23 24]

KFold returns the index, if you want to see the real data we must use "np.take" in the NumPy array or ".iloc" in pandas.

# to get the values from our data, we use np.take() to access a value at particular index for train_index, test_index in kf3.split(rn): print(np.take(rn,train_index), np.take(rn,test_index))

[10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25] [1 2 3 4 5 6 7 8 9]
[ 1  2  3  4  5  6  7  8  9 18 19 20 21 22 23 24 25] [10 11 12 13 14 15 16 17]
[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17] [18 19 20 21 22 23 24 25]

What Users are saying..

Jingwei Li

Graduate Research assistance at Stony Brook University

ProjectPro is an awesome platform that helps me learn much hands-on industrial experience with a step-by-step walkthrough of projects. There are two primary paths to learn: Data Science and Big Data.... Read More

Relevant Projects

Machine Learning Projects

Data Science Projects

Python Projects for Data Science

Data Science Projects in R

Machine Learning Projects for Beginners

Deep Learning Projects

Neural Network Projects

Tensorflow Projects

NLP Projects

Kaggle Projects

IoT Projects

Big Data Projects

Hadoop Real-Time Projects Examples

Spark Projects

Data Analytics Projects for Students

Relevant Projects

Deep Learning Project for Time Series Forecasting in Python

Deep Learning for Time Series Forecasting in Python -A Hands-On Approach to Build Deep Learning Models (MLP, CNN, LSTM, and a Hybrid Model CNN-LSTM) on Time Series Data.

View Project Details

Stock Price Prediction Project using LSTM and RNN

Learn how to predict stock prices using RNN and LSTM models. Understand deep learning concepts and apply them to real-world financial data for accurate forecasting.

View Project Details

Build an End-to-End AWS SageMaker Classification Model

MLOps on AWS SageMaker -Learn to Build an End-to-End Classification Model on SageMaker to predict a patient’s cause of death.

View Project Details

Detectron2 Object Detection and Segmentation Example Python

Object Detection using Detectron2 - Build a Dectectron2 model to detect the zones and inhibitions in antibiogram images.

View Project Details

Model Deployment on GCP using Streamlit for Resume Parsing

Perform model deployment on GCP for resume parsing model using Streamlit App.

View Project Details

Linear Regression Model Project in Python for Beginners Part 1

Machine Learning Linear Regression Project in Python to build a simple linear regression model and master the fundamentals of regression for beginners.

View Project Details

Multi-Class Text Classification with Deep Learning using BERT

In this deep learning project, you will implement one of the most popular state of the art Transformer models, BERT for Multi-Class Text Classification

View Project Details

Many-to-One LSTM for Sentiment Analysis and Text Generation

In this LSTM Project , you will build develop a sentiment detection model using many-to-one LSTMs for accurate prediction of sentiment labels in airline text reviews. Additionally, we will also train many-to-one LSTMs on 'Alice's Adventures in Wonderland' to generate contextually relevant text.

View Project Details

AWS MLOps Project for ARCH and GARCH Time Series Models

Build and deploy ARCH and GARCH time series forecasting models in Python on AWS .

View Project Details

Walmart Sales Forecasting Data Science Project

Data Science Project in R-Predict the sales for each department using historical markdown data from the Walmart dataset containing data of 45 Walmart stores.

View Project Details

How to use k fold cross validation in sklearn

Recipe Objective - How to use k-fold cross-validation in sklearn?

Table of Contents

Links for the more related projects:-

Example:-

Step:1 Import Libraries:-

Step:2 Creating Folds:-

Jingwei Li

Relevant Projects

You might also like

Relevant Projects