How to split a dataset using PyTorch

This recipe helps you split a dataset using PyTorch.
Last Updated: 19 Dec 2022


Recipe Objective

How to split a dataset using PyTorch?

Use the "random_split" function from torch.utils.data. It randomly splits a dataset into two or more non-overlapping subsets of the lengths you specify, and is commonly used to create train and test datasets.
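As a minimal sketch of what random_split returns, the toy example below splits ten samples into parts of 7 and 3. The toy tensors and the names toy_dataset, train_part and test_part are assumptions made for illustration only; they are not part of the recipe.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Ten toy samples with one feature each, and alternating 0/1 labels.
features = torch.arange(10, dtype=torch.float32).unsqueeze(1)
labels = torch.arange(10) % 2
toy_dataset = TensorDataset(features, labels)

# The requested lengths must sum to len(toy_dataset).
train_part, test_part = random_split(toy_dataset, [7, 3])

print(len(train_part), len(test_part))

# Each part is a Subset holding indices into the original dataset,
# so the two parts together cover every sample exactly once.
all_indices = sorted(list(train_part.indices) + list(test_part.indices))
print(all_indices == list(range(10)))
```

The split does not copy any data: each Subset looks items up in the original dataset through its stored indices.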


Step 1 - Import library

import pprint as pp
from sklearn import datasets
import numpy as np
import torch
from torch.utils.data import Dataset
from torch.utils.data import random_split

Step 2 - Take Sample data

samples = 2000
# centers lists two 2-D cluster centers, so every sample has 2 features
X_data, Y_data = datasets.make_blobs(n_samples=samples, n_features=2, centers=[(0, 5), (4, 0)], random_state=0)
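It can help to inspect the generated arrays before wrapping them in a Dataset. The sketch below regenerates the same blobs and prints their shapes. Because centers is passed as explicit coordinates, scikit-learn takes the number of features from those coordinates, which is 2 here.

```python
import numpy as np
from sklearn import datasets

# Same sample data as in the recipe: two clusters centred at (0, 5) and (4, 0).
X_data, Y_data = datasets.make_blobs(
    n_samples=2000, centers=[(0, 5), (4, 0)], random_state=0
)

print(X_data.shape)         # (2000, 2): one row per sample, one column per feature
print(Y_data.shape)         # (2000,): one integer cluster label per sample
print(np.bincount(Y_data))  # number of samples in each cluster
```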

Step 3 - Create Dataset Class

class CreateDataset(Dataset):
    def __init__(self, x, y):
        # Store the arrays passed in, not the global X_data / Y_data
        self.x = x
        self.y = y

    def __getitem__(self, index):
        # Return one sample as a dict of tensors
        sample = {
            'feature': torch.tensor(self.x[index], dtype=torch.float32),
            'label': torch.tensor(self.y[index], dtype=torch.long)}
        return sample

    def __len__(self):
        # Number of samples in the dataset
        return len(self.x)
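To see how a map-style Dataset like this is consumed, here is a small standalone sketch. The class name ArrayDataset and the toy arrays are hypothetical. This sketch assumes each item is a 1-D feature tensor and a scalar label, so that a DataLoader can batch them into shapes (batch, features) and (batch,).

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class ArrayDataset(Dataset):  # hypothetical name, for illustration only
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __getitem__(self, index):
        # One sample: a feature vector and its integer label.
        return {
            'feature': torch.tensor(self.x[index], dtype=torch.float32),
            'label': torch.tensor(self.y[index], dtype=torch.long),
        }

    def __len__(self):
        return len(self.x)


# Six toy samples with two features each.
x = np.arange(12, dtype=np.float64).reshape(6, 2)
y = np.array([0, 1, 0, 1, 0, 1])
demo = ArrayDataset(x, y)

first = demo[0]  # indexing calls __getitem__
print(first['feature'], first['label'])

# The default collate function stacks the dict values key by key.
loader = DataLoader(demo, batch_size=4, shuffle=False)
batch = next(iter(loader))
print(batch['feature'].shape, batch['label'].shape)
```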

Step 4 - Create the dataset and check its length

torch_dataset = CreateDataset(X_data, Y_data)
print("length of the dataset is:", len(torch_dataset))

length of the dataset is: 2000

Step 5 - Split the dataset

train_data, test_data = random_split(torch_dataset, [1400, 600])
print("The length of train data is:", len(train_data))
print("The length of test data is:", len(test_data))

The length of train data is: 1400
The length of test data is: 600
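random_split draws a new random permutation on every call, so the samples that land in each part change from run to run. One way to make the split repeatable is to pass a seeded torch.Generator through the generator argument, as sketched below. The 70/30 ratio, the seed value 42, the helper seeded_split and the random stand-in dataset are all assumptions for illustration, and the generator argument assumes a reasonably recent PyTorch release.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in dataset with 2000 samples and 2 features (values are arbitrary).
dataset = TensorDataset(torch.randn(2000, 2), torch.randint(0, 2, (2000,)))

# Derive the lengths from a 70/30 ratio using integer arithmetic.
train_len = len(dataset) * 7 // 10
test_len = len(dataset) - train_len


def seeded_split(seed):
    """Split the dataset with a generator seeded for reproducibility."""
    generator = torch.Generator().manual_seed(seed)
    return random_split(dataset, [train_len, test_len], generator=generator)


train_a, test_a = seeded_split(42)
train_b, test_b = seeded_split(42)

print(len(train_a), len(test_a))
# The same seed selects the same samples for the train part.
print(list(train_a.indices) == list(train_b.indices))
```

Fixing the seed keeps the test set the same across runs, which makes evaluation results comparable between experiments.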
