How to do KMeans Clustering in Python?

This recipe helps you do KMeans Clustering in Python

Recipe Objective

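Have you ever tried clustering data with the K-means algorithm? K-means partitions the observations into k groups by assigning each point to the cluster whose centre (mean) is nearest.

So this recipe is a short example of how we can do KMeans Clustering in Python.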


Step 1 - Import the library

    from sklearn import datasets
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans
    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt

Here we have imported modules such as datasets, StandardScaler and KMeans from scikit-learn, along with pandas, seaborn and matplotlib. We will see how each of them is used in the code snippets that follow.
For now, just have a look at these imports.

Step 2 - Set up the data for clustering

Here we have used datasets to load the built-in iris dataset, stored its feature matrix in the object X and made a DataFrame from it. We have then plotted a heat map of the correlation between the features.

    iris = datasets.load_iris()
    X = iris.data
    data = pd.DataFrame(X)
    cor = data.corr()

    fig = plt.figure(figsize=(12, 10))
    sns.heatmap(cor, square=True)
    plt.show()
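The DataFrame in this step has integer column names (0 to 3), so the heat map axes do not say which feature is which. A minimal sketch, assuming labelled axes are wanted, is to pass iris.feature_names as the column names. The names data_named and cor_named are illustrative and not part of the original recipe.

```python
import pandas as pd
from sklearn import datasets

iris = datasets.load_iris()

# Name the columns after the features instead of the default 0..3
data_named = pd.DataFrame(iris.data, columns=iris.feature_names)

# Pairwise Pearson correlation between the four features
cor_named = data_named.corr()
print(cor_named.round(2))
```

Passing cor_named to sns.heatmap in place of cor, optionally with annot=True, should give a heat map whose axes carry the feature names.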

Step 3 - Model and its Score

First, we have used StandardScaler to standardise the data so that each feature has a mean of zero and a standard deviation of 1. We then use KMeans with n_clusters equal to 3 as the machine learning model and fit it to the standardised data.

    scaler = StandardScaler()
    X_std = scaler.fit_transform(X)
    clt = KMeans(n_clusters=3)
    model = clt.fit(X_std)

Now we have predicted the cluster of each sample by passing X_std to the fitted model, and stored the result as a new column.

    clusters = pd.DataFrame(model.predict(X_std))
    data["Cluster"] = clusters

Here we have plotted the clusters so that the data points of a cluster have the same colour.

    fig = plt.figure(figsize=(12, 10))
    ax = fig.add_subplot(111)
    scatter = ax.scatter(data[0], data[1], c=data["Cluster"], s=50)
    ax.set_title("KMeans Clustering")
    ax.set_xlabel("X0")
    ax.set_ylabel("X1")
    plt.colorbar(scatter)
    plt.show()

The output is a scatter plot of the first two features, with each point coloured by its cluster.
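This step fits the model and plots the clusters, but it does not compute a numeric score. The following is a rough sketch of one way to do so, not the recipe's own method: it checks the standardisation, then reports the model's inertia and the silhouette score from sklearn.metrics. The random_state and n_init values are assumptions added only to make the run repeatable, and the adjusted Rand index is possible here only because iris happens to ship with species labels.

```python
import numpy as np
from sklearn import datasets
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score, silhouette_score
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X_std = StandardScaler().fit_transform(iris.data)

# Check the standardisation: per-feature mean ~0 and standard deviation ~1
print("means:", np.round(X_std.mean(axis=0), 6))
print("stds: ", np.round(X_std.std(axis=0), 6))

# random_state and n_init are fixed here only for repeatability (assumption)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_std)
labels = km.labels_

# Inertia: sum of squared distances of samples to their closest cluster centre
print("inertia:", km.inertia_)

# Silhouette score lies in [-1, 1]; higher means better-separated clusters
sil = silhouette_score(X_std, labels)
print("silhouette:", sil)

# Agreement with the known species labels (not available for most real data)
ari = adjusted_rand_score(iris.target, labels)
print("adjusted Rand index:", ari)
```

Inertia always decreases as n_clusters grows, so it is mainly useful for comparing runs with the same number of clusters; the silhouette score can be compared across different values of n_clusters.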

 
