How to deal with imbalance classes with downsampling in Python?
DATA MUNGING

How to deal with imbalance classes with downsampling in Python?

How to deal with imbalance classes with downsampling in Python?

This recipe helps you deal with imbalance classes with downsampling in Python

0
This data science python source code does the following: 1. Imports necessary libraries and iris data from sklearn dataset 2. Use of "where" function for data handling 3. Downsamples the higher class to balance the data
In [1]:
## How to deal with imbalance classes with downsampling in Python 
def Kickstarter_Example_32():
    print()
    print(format('How to deal with imbalance classes with downsampling in Python', '*^82'))

    import warnings
    warnings.filterwarnings("ignore")

    # Load libraries
    import numpy as np
    from sklearn.datasets import load_iris

    # Load iris data
    iris = load_iris()

    # Create feature matrix
    X = iris.data

    # Create target vector
    y = iris.target

    # Make Iris Dataset Imbalanced # Remove first 40 observations
    X = X[40:,:]
    y = y[40:]

    # Create binary target vector indicating if class 0
    y = np.where((y == 0), 0, 1)

    # Look at the imbalanced target vector
    print(); print("Look at the imbalanced target vector:\n", y)

    # Downsample Majority Class To Match Minority Class
    # Indicies of each class' observations
    i_class0 = np.where(y == 0)[0]
    i_class1 = np.where(y == 1)[0]

    # Number of observations in each class
    n_class0 = len(i_class0); print(); print("n_class0: ", n_class0)
    n_class1 = len(i_class1); print(); print("n_class1: ", n_class1)

    # For every observation of class 0, randomly sample from class 1 without replacement
    i_class1_downsampled = np.random.choice(i_class1, size=n_class0, replace=False)

    # Join together class 0's target vector with the downsampled class 1's target vector
    print(); print(np.hstack((y[i_class0], y[i_class1_downsampled])))

Kickstarter_Example_32()
**********How to deal with imbalance classes with downsampling in Python**********

Look at the imbalanced target vector:
 [0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]

n_class0:  10

n_class1:  100

[0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1]

Relevant Projects

Solving Multiple Classification use cases Using H2O
In this project, we are going to talk about H2O and functionality in terms of building Machine Learning models.

Predict Churn for a Telecom company using Logistic Regression
Machine Learning Project in R- Predict the customer churn of telecom sector and find out the key drivers that lead to churn. Learn how the logistic regression model using R can be used to identify the customer churn in telecom dataset.

Time Series Forecasting with LSTM Neural Network Python
Deep Learning Project- Learn to apply deep learning paradigm to forecast univariate time series data.

Predict Macro Economic Trends using Kaggle Financial Dataset
In this machine learning project, you will uncover the predictive value in an uncertain world by using various artificial intelligence, machine learning, advanced regression and feature transformation techniques.

Identifying Product Bundles from Sales Data Using R Language
In this data science project in R, we are going to talk about subjective segmentation which is a clustering technique to find out product bundles in sales data.

Forecast Inventory demand using historical sales data in R
In this machine learning project, you will develop a machine learning model to accurately forecast inventory demand based on historical sales data.

Data Science Project - Instacart Market Basket Analysis
Data Science Project - Build a recommendation engine which will predict the products to be purchased by an Instacart consumer again.

Choosing the right Time Series Forecasting Methods
There are different time series forecasting methods to forecast stock price, demand etc. In this machine learning project, you will learn to determine which forecasting method to be used when and how to apply with time series forecasting example.

Perform Time series modelling using Facebook Prophet
In this project, we are going to talk about Time Series Forecasting to predict the electricity requirement for a particular house using Prophet.

Mercari Price Suggestion Challenge Data Science Project
Data Science Project in Python- Build a machine learning algorithm that automatically suggests the right product prices.