How to split train test data using sklearn and python?
DATA MUNGING DATA CLEANING PYTHON MACHINE LEARNING RECIPES PANDAS CHEATSHEET     ALL TAGS

How to split train test data using sklearn and python?

How to split train test data using sklearn and python?

This recipe helps you split train test data using sklearn and python

3
This python source code does the following: 1. Imports the necessary libraries 2. Loads Digit dataset from sklearn library 3. performs train_test_split on the dataset and manipulates its parameters
In [1]:
## How to split train test data using sklearn and python
def Snippet_131():
    print()
    print(format('How to split train test data using sklearn and python','*^82'))

    import warnings
    warnings.filterwarnings("ignore")

    # load libraries
    from sklearn import datasets
    from sklearn.model_selection import train_test_split

    # Load the digits dataset
    digits = datasets.load_digits()

    # Create the features matrix
    X = digits.data
    print(); print(X.shape)

    # Create the target vector
    y = digits.target
    print(); print(y.shape)

    # Create training and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33,
                                                        random_state=42)
    print(); print(X_train.shape)
    print(); print(X_test.shape)
    print(); print(y_train.shape)
    print(); print(y_test.shape)

Snippet_131()
**************How to split train test data using sklearn and python***************

(1797, 64)

(1797,)

(1203, 64)

(594, 64)

(1203,)

(594,)

Relevant Projects

Loan Eligibility Prediction using Gradient Boosting Classifier
This data science in python project predicts if a loan should be given to an applicant or not. We predict if the customer is eligible for loan based on several factors like credit score and past history.

Credit Card Fraud Detection as a Classification Problem
In this data science project, we will predict the credit card fraud in the transactional dataset using some of the predictive models.

Choosing the right Time Series Forecasting Methods
There are different time series forecasting methods to forecast stock price, demand etc. In this machine learning project, you will learn to determine which forecasting method to be used when and how to apply with time series forecasting example.

Music Recommendation System Project using Python and R
Machine Learning Project - Work with KKBOX's Music Recommendation System dataset to build the best music recommendation engine.

Resume parsing with Machine learning - NLP with Python OCR and Spacy
In this machine learning resume parser example we use the popular Spacy NLP python library for OCR and text classification.

Time Series Forecasting with LSTM Neural Network Python
Deep Learning Project- Learn to apply deep learning paradigm to forecast univariate time series data.

German Credit Dataset Analysis to Classify Loan Applications
In this data science project, you will work with German credit dataset using classification techniques like Decision Tree, Neural Networks etc to classify loan applications using R.

Build an Image Classifier for Plant Species Identification
In this machine learning project, we will use binary leaf images and extracted features, including shape, margin, and texture to accurately identify plant species using different benchmark classification techniques.

Solving Multiple Classification use cases Using H2O
In this project, we are going to talk about H2O and functionality in terms of building Machine Learning models.

Predict Employee Computer Access Needs in Python
Data Science Project in Python- Given his or her job role, predict employee access needs using amazon employee database.