DATA MUNGING
DATA CLEANING PYTHON
MACHINE LEARNING RECIPES
PANDAS CHEATSHEET
ALL TAGS
# How to impute missing class labels using nearest neighbours in Python?

# How to impute missing class labels using nearest neighbours in Python?

This recipe helps you impute missing class labels using nearest neighbours in Python

Have you ever tried to impute calss labels? We can impute class labels by K nearest neighbours by training it on known data and predicting the class labels.

So this is the recipe on how we can impute missing class labels using nearest neighbours in Python.

```
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
```

We have imported numpy and KNeighborsClassifier which is needed.

We have created a feature matrix using array and we will use this to train the KNN model.
```
X = np.array([[0, 2.10, 1.45],
[2, 1.18, 1.33],
[0, 1.22, 1.27],
[1, 1.32, 1.97],
[1, -0.21, -1.19]])
```

We have created a matrix with missing class labels.
```
X_with_nan = np.array([[np.nan, 0.87, 1.31],
[np.nan, 0.37, 1.91],
[np.nan, 0.54, 1.27],
[np.nan, -0.67, -0.22]])
```

We are training the KNeighborsClassifier with parameters K equals to 3 and weights equals to distance. We have used the matrix X to train the model.
```
clf = KNeighborsClassifier(3, weights="distance")
trained_model = clf.fit(X[:,1:], X[:,0])
```

We have predicted the class labels of matrix "X_with_nan".
```
imputed_values = trained_model.predict(X_with_nan[:,1:])
print(imputed_values)
```

So finally we have filled the null values with the predicted output of model.
```
X_with_imputed = np.hstack((imputed_values.reshape(-1,1), X_with_nan[:,1:]))
print(); print(X_with_imputed)
```

So the output comes as

[2. 1. 2. 1.] [[ 2. 0.87 1.31] [ 1. 0.37 1.91] [ 2. 0.54 1.27] [ 1. -0.67 -0.22]]

Data Science Project in Python- Given his or her job role, predict employee access needs using amazon employee database.

In this machine learning resume parser example we use the popular Spacy NLP python library for OCR and text classification.

In this machine learning pricing project, we implement a retail price optimization algorithm using regression trees. This is one of the first steps to building a dynamic pricing model.

Text data requires special preparation before you can start using it for any machine learning project.In this ML project, you will learn about applying Machine Learning models to create classifiers and learn how to make sense of textual data.

Data Science Project in R-Predict the sales for each department using historical markdown data from the Walmart dataset containing data of 45 Walmart stores.

In this deep learning project, we will predict customer churn using Artificial Neural Networks and learn how to model an ANN in R with the keras deep learning package.

In this machine learning project, we will use binary leaf images and extracted features, including shape, margin, and texture to accurately identify plant species using different benchmark classification techniques.

In this machine learning project you will work on creating a robust prediction model of Rossmann's daily sales using store, promotion, and competitor data.

In this data science project in R, we are going to talk about subjective segmentation which is a clustering technique to find out product bundles in sales data.

In this machine learning project, you will develop a machine learning model to accurately forecast inventory demand based on historical sales data.