How to do DBSCAN based Clustering in Python?

This recipe helps you do DBSCAN based Clustering in Python

Recipe Objective

One of the most important model of Machine Learning is Clustering. It takes a bunch of datapoints and put it in a perticular class based on some features.

So this recipe is a short example of how we can do DBSCAN based Clustering in Python

Step 1 - Import the library

from sklearn import datasets from sklearn.preprocessing import StandardScaler from sklearn.cluster import DBSCAN import pandas as pd import seaborn as sns import matplotlib.pyplot as plt

Here we have imported various modules like DBSCAN, datasets, StandardScale and many more from differnt libraries. We will understand the use of these later while using it in the in the code snipet.
For now just have a look on these imports.

Step 2 - Setup the Data

Here we have used datasets to load the inbuilt iris dataset and we have created objects X and y to store the data and the target value respectively. iris = datasets.load_iris() X = iris.data data = pd.DataFrame(X)

Step 3 - Using StandardScaler and Clustering

StandardScaler is used to remove the outliners and scale the data by making the mean of the data 0 and standard deviation as 1. So we are creating an object std_scl to use standardScaler. std_slc = StandardScaler() X_std = std_slc.fit_transform(X)

We are using DBSCAN as a model and we have trained it by using the data we get after standerd scaling. Then we predicted the clusters and stored it in a dataframe. clt = DBSCAN() model = clt.fit(X_std) clusters = pd.DataFrame(model.fit_predict(X_std)) data["Cluster"] = clusters

Step 4 - Visualising the clusters

Here we are ploting scatterplot of the dataset and marking clusters in same colors. fig = plt.figure(figsize=(10,10)); ax = fig.add_subplot(111) scatter = ax.scatter(data[0],data[1], c=data["Cluster"],s=50) ax.set_title("DBSCAN Clustering") ax.set_xlabel("X0") ax.set_ylabel("X1") plt.colorbar(scatter) plt.show() As an output we get

 

Download Materials

What Users are saying..

profile image

Savvy Sahai

Data Science Intern, Capgemini
linkedin profile url

As a student looking to break into the field of data engineering and data science, one can get really confused as to which path to take. Very few ways to do it are Google, YouTube, etc. I was one of... Read More

Relevant Projects

PyTorch Project to Build a GAN Model on MNIST Dataset
In this deep learning project, you will learn how to build a GAN Model on MNIST Dataset for generating new images of handwritten digits.

Build a Face Recognition System in Python using FaceNet
In this deep learning project, you will build your own face recognition system in Python using OpenCV and FaceNet by extracting features from an image of a person's face.

Build a Graph Based Recommendation System in Python -Part 1
Python Recommender Systems Project - Learn to build a graph based recommendation system in eCommerce to recommend products.

Many-to-One LSTM for Sentiment Analysis and Text Generation
In this LSTM Project , you will build develop a sentiment detection model using many-to-one LSTMs for accurate prediction of sentiment labels in airline text reviews. Additionally, we will also train many-to-one LSTMs on 'Alice's Adventures in Wonderland' to generate contextually relevant text.

NLP Project on LDA Topic Modelling Python using RACE Dataset
Use the RACE dataset to extract a dominant topic from each document and perform LDA topic modeling in python.

Build an End-to-End AWS SageMaker Classification Model
MLOps on AWS SageMaker -Learn to Build an End-to-End Classification Model on SageMaker to predict a patient’s cause of death.

Build Portfolio Optimization Machine Learning Models in R
Machine Learning Project for Financial Risk Modelling and Portfolio Optimization with R- Build a machine learning model in R to develop a strategy for building a portfolio for maximized returns.

Build an AI Chatbot from Scratch using Keras Sequential Model
In this NLP Project, you will learn how to build an AI Chatbot from Scratch using Keras Sequential Model.

NLP Project to Build a Resume Parser in Python using Spacy
Use the popular Spacy NLP python library for OCR and text classification to build a Resume Parser in Python.

Build a Text Generator Model using Amazon SageMaker
In this Deep Learning Project, you will train a Text Generator Model on Amazon Reviews Dataset using LSTM Algorithm in PyTorch and deploy it on Amazon SageMaker.