How to split train test data using sklearn and python?

How to split train test data using sklearn and python?

How to split train test data using sklearn and python?

This recipe helps you split train test data using sklearn and python

This python source code does the following: 1. Imports the necessary libraries 2. Loads Digit dataset from sklearn library 3. performs train_test_split on the dataset and manipulates its parameters
In [1]:
## How to split train test data using sklearn and python
def Snippet_131():
    print(format('How to split train test data using sklearn and python','*^82'))

    import warnings

    # load libraries
    from sklearn import datasets
    from sklearn.model_selection import train_test_split

    # Load the digits dataset
    digits = datasets.load_digits()

    # Create the features matrix
    X =
    print(); print(X.shape)

    # Create the target vector
    y =
    print(); print(y.shape)

    # Create training and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33,
    print(); print(X_train.shape)
    print(); print(X_test.shape)
    print(); print(y_train.shape)
    print(); print(y_test.shape)

**************How to split train test data using sklearn and python***************

(1797, 64)


(1203, 64)

(594, 64)



Relevant Projects

Learn to prepare data for your next machine learning project
Text data requires special preparation before you can start using it for any machine learning project.In this ML project, you will learn about applying Machine Learning models to create classifiers and learn how to make sense of textual data.

Loan Eligibility Prediction using Gradient Boosting Classifier
This data science in python project predicts if a loan should be given to an applicant or not. We predict if the customer is eligible for loan based on several factors like credit score and past history.

Data Science Project-All State Insurance Claims Severity Prediction
Data science project in R to develop automated methods for predicting the cost and severity of insurance claims.

Identifying Product Bundles from Sales Data Using R Language
In this data science project in R, we are going to talk about subjective segmentation which is a clustering technique to find out product bundles in sales data.

Predict Credit Default | Give Me Some Credit Kaggle
In this data science project, you will predict borrowers chance of defaulting on credit loans by building a credit score prediction model.

Solving Multiple Classification use cases Using H2O
In this project, we are going to talk about H2O and functionality in terms of building Machine Learning models.

Predict Churn for a Telecom company using Logistic Regression
Machine Learning Project in R- Predict the customer churn of telecom sector and find out the key drivers that lead to churn. Learn how the logistic regression model using R can be used to identify the customer churn in telecom dataset.

Data Science Project - Instacart Market Basket Analysis
Data Science Project - Build a recommendation engine which will predict the products to be purchased by an Instacart consumer again.

Data Science Project-TalkingData AdTracking Fraud Detection
Machine Learning Project in R-Detect fraudulent click traffic for mobile app ads using R data science programming language.

Deep Learning with Keras in R to Predict Customer Churn
In this deep learning project, we will predict customer churn using Artificial Neural Networks and learn how to model an ANN in R with the keras deep learning package.