This recipe helps you do MeanShift Clustering in Python


Recipe Objective

Have you ever tried to do MeanShift based Clustering in Python? Clustering can give us an idea of how the data set falls into groups, and MeanShift is useful because it does not need the number of clusters to be specified in advance.

So this is the recipe on how we can do MeanShift based Clustering in Python.

Step 1 - Import the library

from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import MeanShift
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

We have imported datasets, StandardScaler, MeanShift, pandas, seaborn, and matplotlib, which are needed to load, cluster, and plot the dataset.

Step 2 - Setting up the Data

We load the built-in breast cancer dataset and store its features in X. We then plot a heatmap of the correlations between the features.

cancer = datasets.load_breast_cancer()
X = cancer.data
data = pd.DataFrame(X)
cor = data.corr()
fig = plt.figure(figsize=(10,10))
sns.heatmap(cor, square = True);
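With integer column names the heatmap axes are hard to read. As a minimal sketch, assuming you are happy to label the columns with the dataset's own feature_names (the recipe itself keeps the default integer names, which Step 4 relies on), the correlation matrix can be built like this:

```python
import pandas as pd
from sklearn import datasets

cancer = datasets.load_breast_cancer()

# Assumption for illustration: name the columns after the dataset's features.
labelled = pd.DataFrame(cancer.data, columns=cancer.feature_names)

# Pairwise Pearson correlation between all features.
cor_labelled = labelled.corr()
print(labelled.shape)      # (569, 30)
print(cor_labelled.shape)  # (30, 30)
```

Passing cor_labelled to sns.heatmap in place of cor would then show feature names on both axes.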

Step 3 - Training model and Predicting Clusters

Here we first standardize the data with StandardScaler, which scales each feature so that its mean becomes 0 and its standard deviation becomes 1.

scaler = StandardScaler()
X_std = scaler.fit_transform(X)

Now we use MeanShift for clustering, created here with its default parameters:

clt = MeanShift()

We train the model with clt.fit and print the number of clusters, taken here as the number of cluster centers.

model = clt.fit(X_std)
print(len(model.cluster_centers_))

Finally we predict the clusters.

clusters = pd.DataFrame(model.fit_predict(X_std))
data["Cluster"] = clusters
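The following sketch checks that the standardized columns really have mean 0 and standard deviation 1, and shows one way to control MeanShift through its bandwidth. The use of estimate_bandwidth and the quantile value of 0.2 are assumptions made for illustration; the recipe itself relies on the default settings.

```python
import numpy as np
from sklearn import datasets
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.preprocessing import StandardScaler

X = datasets.load_breast_cancer().data
X_std = StandardScaler().fit_transform(X)

# After scaling, every column should have mean ~0 and standard deviation ~1.
col_means = X_std.mean(axis=0)
col_stds = X_std.std(axis=0)
print(np.allclose(col_means, 0.0, atol=1e-8), np.allclose(col_stds, 1.0))

# Illustrative assumption: estimate a bandwidth from the data.
# A smaller quantile tends to give a smaller bandwidth and more clusters.
bandwidth = estimate_bandwidth(X_std, quantile=0.2, random_state=0)

clt = MeanShift(bandwidth=bandwidth)
labels = clt.fit_predict(X_std)

# One cluster center is stored per discovered cluster.
n_clusters = len(clt.cluster_centers_)
print("bandwidth:", bandwidth, "clusters:", n_clusters)
```

If the number of clusters looks too large or too small for your data, adjusting quantile (or setting bandwidth directly) is usually the first thing to try.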

Step 4 - Visualizing the output

fig = plt.figure(figsize=(10,10)); ax = fig.add_subplot(111)
scatter = ax.scatter(data[0],data[1], c=data["Cluster"],s=50)
ax.set_title("MeanShift Clustering")
ax.set_xlabel("X0"); ax.set_ylabel("X1")
plt.colorbar(scatter);

We have plotted a scatter plot of the first two features, which shows the clusters of data in different colours.
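A scatter plot of two features out of thirty can hide how the points are actually distributed across clusters. As a rough complement, and assuming the same default MeanShift setup as in the recipe, the number of samples per cluster can be tabulated like this:

```python
import pandas as pd
from sklearn import datasets
from sklearn.cluster import MeanShift
from sklearn.preprocessing import StandardScaler

X = datasets.load_breast_cancer().data
X_std = StandardScaler().fit_transform(X)

data = pd.DataFrame(X)
data["Cluster"] = MeanShift().fit_predict(X_std)

# Number of samples assigned to each cluster label, ordered by label.
sizes = data["Cluster"].value_counts().sort_index()
print(sizes)
```

Very small clusters in this table often correspond to isolated points rather than meaningful groups.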

