How to impute missing class labels in Python?

This recipe helps you impute missing class labels in Python


Recipe Objective

Many datasets contain null values in their features, so how do we manage and fill them?

This recipe shows how to impute missing class labels in Python.

Step 1 - Import the library

import numpy as np
from sklearn.preprocessing import Imputer

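We have imported numpy and Imputer, both of which are needed. Note that Imputer is only available in scikit-learn versions before 0.22; newer releases provide SimpleImputer in sklearn.impute instead.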

Step 2 - Creating Data

We have created a matrix that contains numeric values along with some null (np.nan) values in the first column.

X = np.array([[2, 2.15, 1.5],
              [1, 1.64, 1.25],
              [2, 1.15, 1.45],
              [0, -0.45, -1.52],
              [np.nan, 0.54, 1.15],
              [np.nan, -0.65, -0.61]])
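
Before imputing, it can help to confirm where the null values actually are. The following is a minimal sketch using only numpy; the variable names missing_mask, missing_per_column and rows_with_missing are illustrative and not part of the original recipe.

```python
import numpy as np

X = np.array([[2, 2.15, 1.5],
              [1, 1.64, 1.25],
              [2, 1.15, 1.45],
              [0, -0.45, -1.52],
              [np.nan, 0.54, 1.15],
              [np.nan, -0.65, -0.61]])

# Boolean mask that is True wherever a value is missing
missing_mask = np.isnan(X)

# Number of missing values in each column
missing_per_column = missing_mask.sum(axis=0)
print(missing_per_column)  # [2 0 0]

# Indices of the rows that contain at least one missing value
rows_with_missing = np.where(missing_mask.any(axis=1))[0]
print(rows_with_missing)  # [4 5]
```

Here only the first column has gaps, in the last two rows, so those are the entries the imputer will fill.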

Step 3 - Imputing Missing values

We have created an Imputer object with two parameters: strategy, which sets the imputation method, and axis, which is 0 to impute along columns or 1 to impute along rows. With strategy="most_frequent" and axis=0, each null value is replaced by the most frequent value in its column. We have then used fit_transform to fit the imputer on the data and fill in the null values.

imputer = Imputer(strategy="most_frequent", axis=0)
print(X)
print(imputer.fit_transform(X))

[[ 2.    2.15  1.5 ]
 [ 1.    1.64  1.25]
 [ 2.    1.15  1.45]
 [ 0.   -0.45 -1.52]
 [  nan  0.54  1.15]
 [  nan -0.65 -0.61]]

[[ 2.    2.15  1.5 ]
 [ 1.    1.64  1.25]
 [ 2.    1.15  1.45]
 [ 0.   -0.45 -1.52]
 [ 2.    0.54  1.15]
 [ 2.   -0.65 -0.61]]
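
On scikit-learn 0.22 and later, where Imputer is no longer available, the same result can likely be obtained with SimpleImputer from sklearn.impute. The sketch below is an assumed equivalent rather than part of the original recipe; SimpleImputer has no axis argument because it always imputes column by column, and the name X_imputed is illustrative.

```python
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[2, 2.15, 1.5],
              [1, 1.64, 1.25],
              [2, 1.15, 1.45],
              [0, -0.45, -1.52],
              [np.nan, 0.54, 1.15],
              [np.nan, -0.65, -0.61]])

# SimpleImputer works column-wise, so no axis argument is needed
imputer = SimpleImputer(missing_values=np.nan, strategy="most_frequent")
X_imputed = imputer.fit_transform(X)

# The fill value learned for the first column is its most frequent value
print(imputer.statistics_[0])  # 2.0

# First column after imputation: the two nan entries are now 2.0
print(X_imputed[:, 0])  # [2. 1. 2. 0. 2. 2.]
```

The first column matches the output above: both missing labels are filled with 2, the value that occurs most often among the observed labels.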

