Have you ever tried MeanShift-based clustering in Python? Clustering gives us an idea of how a dataset falls into groups, and MeanShift is useful when we do not know the number of clusters in advance, because it does not need that number as an input.
So this is the recipe on how we can do MeanShift-based clustering in Python.
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import MeanShift
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
We have imported datasets, StandardScaler, MeanShift, pandas, seaborn, and matplotlib, which are needed to load, scale, cluster, and plot the data.
We load the built-in breast cancer dataset and store its feature matrix in X. We then plot a heatmap of the correlation between features.
cancer = datasets.load_breast_cancer()
X = cancer.data
data = pd.DataFrame(X)
cor = data.corr()
fig = plt.figure(figsize=(10,10))
sns.heatmap(cor, square = True); plt.show()
Here we first standardize the data with StandardScaler. StandardScaler rescales each feature so that its mean becomes 0 and its standard deviation becomes 1.
scaler = StandardScaler()
X_std = scaler.fit_transform(X)
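To see what this scaling does, here is a minimal sketch that reproduces it by hand with NumPy. The toy array and the name X_demo are made up for illustration and are not part of the recipe.

```python
import numpy as np

# Hypothetical toy data: 4 samples, 2 features on very different scales
X_demo = np.array([[1.0, 100.0],
                   [2.0, 300.0],
                   [3.0, 500.0],
                   [4.0, 700.0]])

# Standardize each column: subtract its mean, divide by its standard deviation
# (np.std uses the population formula by default, as StandardScaler does)
X_demo_std = (X_demo - X_demo.mean(axis=0)) / X_demo.std(axis=0)

print(X_demo_std.mean(axis=0))  # close to 0 for every feature
print(X_demo_std.std(axis=0))   # close to 1 for every feature
```

After scaling, both features contribute on a comparable scale to the distances that MeanShift relies on.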
Now we create a MeanShift clustering model with its default parameters:
clt = MeanShift()
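When MeanShift() is called without arguments, it estimates its bandwidth (the size of the neighbourhood used to shift points) from the data. If you want to control that value, one possible approach is to call estimate_bandwidth yourself. The sketch below uses synthetic data rather than the breast cancer set, and the quantile of 0.2 and the make_blobs settings are arbitrary assumptions chosen only for illustration.

```python
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.datasets import make_blobs

# Synthetic 2-D data with three blobs (illustrative only)
X_demo, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)

# Estimate a bandwidth from pairwise distances; a smaller quantile gives
# a smaller bandwidth and usually more clusters
bw = estimate_bandwidth(X_demo, quantile=0.2, random_state=0)

clt_demo = MeanShift(bandwidth=bw)
labels = clt_demo.fit_predict(X_demo)

print("Estimated bandwidth:", bw)
print("Number of clusters:", len(clt_demo.cluster_centers_))
```

Trying a few quantile values and watching how the number of clusters changes is a simple way to judge how sensitive the result is to the bandwidth.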
We fit the model with clt.fit and print the number of clusters it found.
model = clt.fit(X_std)
print("Number of clusters:", len(model.cluster_centers_))
Finally, we get the cluster label of each sample. The labels were already computed during fitting, so there is no need to fit the model again.
clusters = model.labels_
data["Cluster"] = clusters
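Before plotting, it can help to check how many samples landed in each cluster. Here is a small sketch with NumPy on a made-up label array; demo_labels is hypothetical, and in this recipe you would pass data["Cluster"] instead.

```python
import numpy as np

# Hypothetical cluster labels for eight samples
demo_labels = np.array([0, 0, 1, 0, 2, 1, 0, 0])

# Unique labels and how many samples carry each one
cluster_ids, counts = np.unique(demo_labels, return_counts=True)

for cid, n in zip(cluster_ids, counts):
    print(f"cluster {cid}: {n} samples")
```

A cluster holding only a handful of samples may point to outliers or to a bandwidth that is too small for the data.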
fig = plt.figure(figsize=(10,10))
ax = fig.add_subplot(111)
# Plot the first two features, coloured by cluster label
scatter = ax.scatter(data[0], data[1], c=data["Cluster"], s=50)
ax.set_title("MeanShift Clustering")
ax.set_xlabel("X0"); ax.set_ylabel("X1")
plt.colorbar(scatter); plt.show()
We have plotted a scatter plot of the first two features, which shows the clusters in different colours.