How to impute missing values with means in Python?

This recipe shows how to replace missing values (NaN) with the mean of their column in Python using scikit-learn.

## How to impute missing values with means in Python 
def Kickstarter_Example_35():
    print()
    print(format('How to impute missing values with means in Python', '*^82'))

    import warnings
    warnings.filterwarnings("ignore")

    # load libraries
    import pandas as pd
    import numpy as np
    # Imputer was removed from sklearn.preprocessing in scikit-learn 0.22;
    # SimpleImputer in sklearn.impute is its replacement
    from sklearn.impute import SimpleImputer

    # Create an empty dataset
    df = pd.DataFrame()

    # Create two variables called V0 and V1. V1 has three missing values
    # (rows 0, 1 and 7)
    df['V0'] = [0.3051,0.4949,0.6974,0.3769,0.2231,
                0.341,0.4436,0.5897,0.6308,0.5]
    df['V1'] = [np.nan,np.nan,0.2615,0.5846,0.4615,
                0.8308,0.4962,np.nan,0.5346,0.6731]

    # View the dataset
    print(); print(df)

    # Create an imputer object that looks for NaN values,
    # then replaces them with the mean of the column they belong to
    # (SimpleImputer always works column-wise, so there is no axis argument)
    mean_imputer = SimpleImputer(missing_values=np.nan, strategy='mean')

    # Train the imputer on the df dataset (it learns the mean of each column)
    mean_imputer = mean_imputer.fit(df)

    # Apply the imputer to the df dataset; the result is a NumPy array
    imputed_df = mean_imputer.transform(df)

    # View the data
    print(); print(imputed_df)

Kickstarter_Example_35()
****************How to impute missing values with means in Python*****************

       V0      V1
0  0.3051     NaN
1  0.4949     NaN
2  0.6974  0.2615
3  0.3769  0.5846
4  0.2231  0.4615
5  0.3410  0.8308
6  0.4436  0.4962
7  0.5897     NaN
8  0.6308  0.5346
9  0.5000  0.6731

[[0.3051 0.5489]
 [0.4949 0.5489]
 [0.6974 0.2615]
 [0.3769 0.5846]
 [0.2231 0.4615]
 [0.341  0.8308]
 [0.4436 0.4962]
 [0.5897 0.5489]
 [0.6308 0.5346]
 [0.5    0.6731]]
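
The same result can be cross-checked with pandas alone. The snippet below is a minimal sketch, not part of the original recipe: it rebuilds the same dataset and fills each NaN with its column mean using `DataFrame.fillna`. The names `column_means` and `filled_df` are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

# Same values as the dataset used in the recipe
df = pd.DataFrame({
    "V0": [0.3051, 0.4949, 0.6974, 0.3769, 0.2231,
           0.341, 0.4436, 0.5897, 0.6308, 0.5],
    "V1": [np.nan, np.nan, 0.2615, 0.5846, 0.4615,
           0.8308, 0.4962, np.nan, 0.5346, 0.6731],
})

# Column means; pandas skips NaN values by default when averaging
column_means = df.mean()

# Replace each NaN with the mean of its own column
filled_df = df.fillna(column_means)

print(filled_df.round(4))
```

Unlike the imputer, which returns a plain NumPy array, this approach keeps the result as a DataFrame with its column names. The imputer is likely the better fit when the means learned from one dataset must be reapplied to another, since it stores them after `fit`.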
