How to deal with imbalanced classes with upsampling in Python?

This recipe shows how to deal with imbalanced classes in Python by upsampling the minority class.

This Python source code does the following:
1. Imports the necessary libraries and loads the iris data from sklearn.datasets
2. Uses NumPy's "where" function to build a binary target vector and to find the indices of each class
3. Upsamples the minority class (sampling with replacement) to balance the data
In [1]:
## How to deal with imbalanced classes with upsampling in Python
def Kickstarter_Example_33():
    print()
    print(format('How to deal with imbalance classes with upsampling in Python', '*^82'))

    import warnings
    warnings.filterwarnings("ignore")

    # Load libraries
    import numpy as np
    from sklearn.datasets import load_iris

    # Load iris data
    iris = load_iris()

    # Create feature matrix
    X = iris.data

    # Create target vector
    y = iris.target

    # Make the iris dataset imbalanced by removing the first 40 observations
    X = X[40:,:]
    y = y[40:]

    # Create binary target vector: 0 if class 0, otherwise 1
    y = np.where((y == 0), 0, 1)

    # Look at the imbalanced target vector
    print(); print("Look at the imbalanced target vector:\n", y)

    # Upsample minority class to match majority class
    # Indices of each class's observations
    i_class0 = np.where(y == 0)[0]
    i_class1 = np.where(y == 1)[0]

    # Number of observations in each class
    n_class0 = len(i_class0)
    n_class1 = len(i_class1)
    print(); print("n_class0: ", n_class0)
    print(); print("n_class1: ", n_class1)

    # For every observation of class 1, randomly sample from class 0 with replacement
    i_class0_upsampled = np.random.choice(i_class0, size=n_class1, replace=True)

    # Join together class 1's target vector with the upsampled class 0's target vector
    print(); print(np.hstack((y[i_class0_upsampled], y[i_class1])))

Kickstarter_Example_33()
***********How to deal with imbalance classes with upsampling in Python***********

Look at the imbalanced target vector:
 [0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]

n_class0:  10

n_class1:  100

[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
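
The recipe above prints only the balanced target vector. In practice the same indices usually need to be applied to the feature matrix as well, so that each resampled label keeps its row of features. The following is a minimal sketch of that step, not part of the original recipe: the helper name `upsample_minority`, its `minority_label` and `seed` parameters, and the synthetic stand-in data (10 rows of class 0, 100 rows of class 1, mirroring the counts above) are all assumptions made for illustration.

```python
import numpy as np


def upsample_minority(X, y, minority_label=0, seed=0):
    """Resample the minority class with replacement until it matches
    the size of the remaining observations (hypothetical helper)."""
    rng = np.random.default_rng(seed)  # seeded so the draw is repeatable

    # Indices of the minority rows and of all other rows
    i_minority = np.where(y == minority_label)[0]
    i_majority = np.where(y != minority_label)[0]

    # Draw as many minority indices as there are majority rows
    i_upsampled = rng.choice(i_minority, size=len(i_majority), replace=True)

    # Apply the same indices to features and targets so rows stay aligned
    idx = np.concatenate((i_upsampled, i_majority))
    return X[idx], y[idx]


# Synthetic stand-in with the same class counts as the recipe (10 vs 100)
y = np.array([0] * 10 + [1] * 100)
X = np.arange(len(y) * 4, dtype=float).reshape(len(y), 4)

X_balanced, y_balanced = upsample_minority(X, y)
print(X_balanced.shape)         # (200, 4)
print(np.bincount(y_balanced))  # [100 100]
```

Because upsampling duplicates minority rows, it is generally safer to split off the test set first and upsample only the training portion, so that copies of the same observation do not end up on both sides of the split.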
