How to split train test data using sklearn and python?

This recipe shows how to split a dataset into train and test sets using sklearn and Python.
Last Updated: 16 Dec 2022


Recipe Objective

To train and evaluate a model we need two separate sets of data: a train set that the model learns from, and a test set that acts as completely new data on which the model's predictions are checked. But if we are given only one fixed dataset, how do we generate the train and test sets?

So this is the recipe on how we can split data into train and test sets using sklearn and Python.


Step 1 - Import the library

from sklearn import datasets
from sklearn.model_selection import train_test_split

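We have imported only what is needed: the datasets module, which provides the inbuilt sample datasets, and train_test_split, which performs the split.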

Step 2 - Setting up the Data

We have loaded the inbuilt wine dataset to use with train_test_split. We have stored the features in X and the target in y. We have also printed the shape of each.

wine = datasets.load_wine()
X = wine.data
print(X.shape)
y = wine.target
print(y.shape)

Step 3 - Splitting the Data

So now we are using train_test_split to split the data. We have passed test_size as 0.33, which means about 33% of the data will be in the test part and the rest will be in the train part. The random_state parameter seeds the random shuffling that is applied before the split, so passing the same value (here 42) reproduces the same split on every run. Finally we have printed the shapes of the train and test data.

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)

So the output comes as follows. The first two lines are printed in Step 2 and the remaining four in Step 3.

(178, 13)

(178,)

(119, 13)

(59, 13)

(119,)

(59,)
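To see where the 119 and 59 come from, and what a fixed random_state buys us, here is a minimal sketch that repeats the split twice with the same seed. The variable names ending in _a and _b, and the helper names expected_test, expected_train and same_split, are illustrative and not part of the original recipe. The size calculation assumes that scikit-learn rounds a fractional test_size up to a whole number of samples, which is consistent with the output above.

```python
import math

import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

# Load the same inbuilt wine dataset used in the recipe.
wine = datasets.load_wine()
X, y = wine.data, wine.target

# Assumption: a fractional test_size is rounded up to whole samples,
# and the train set receives whatever is left.
n_samples = X.shape[0]
expected_test = math.ceil(0.33 * n_samples)
expected_train = n_samples - expected_test

# Split twice with the same random_state.
X_train_a, X_test_a, y_train_a, y_test_a = train_test_split(
    X, y, test_size=0.33, random_state=42
)
X_train_b, X_test_b, y_train_b, y_test_b = train_test_split(
    X, y, test_size=0.33, random_state=42
)

# With an identical seed, both calls should select identical rows.
same_split = bool(
    np.array_equal(X_test_a, X_test_b) and np.array_equal(y_test_a, y_test_b)
)

print(expected_train, expected_test)
print(X_train_a.shape, X_test_a.shape)
print(same_split)
```

Changing random_state to a different integer, or leaving it as None, will generally give a different selection of rows while keeping the same train and test sizes.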
