How to do spectral clustering using Dask?

This recipe helps you do spectral clustering using Dask

Recipe Objective

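Standard spectral clustering does not scale well with the number of samples, because it builds an affinity matrix with one row and one column per sample. The Dask version (dask_ml) works with an approximation to the affinity matrix instead, which avoids most of that expensive computation.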
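To get a feel for why the approximation matters, here is a rough back-of-the-envelope sketch. It assumes dense float64 storage and assumes the approximation only needs affinities between every sample and n_components sampled rows (a simplification of the Nyström-style approach described in the dask_ml documentation). The helper name matrix_megabytes is made up for this illustration.

```python
# Rough memory estimate for dense float64 matrices (an assumption for illustration).
BYTES_PER_FLOAT64 = 8


def matrix_megabytes(n_rows, n_cols):
    """Size in MB (10**6 bytes) of a dense float64 matrix of the given shape."""
    return n_rows * n_cols * BYTES_PER_FLOAT64 / 1e6


n_samples = 10_000    # same sample count as used later in this recipe
n_components = 100    # same n_components as used later in this recipe

# Full affinity matrix: one entry per pair of samples.
full_mb = matrix_megabytes(n_samples, n_samples)

# Approximation (simplified): affinities between all samples and n_components rows.
approx_mb = matrix_megabytes(n_samples, n_components)

print(f"full affinity matrix: {full_mb:.0f} MB")    # 800 MB
print(f"approximate version:  {approx_mb:.0f} MB")  # 8 MB
```

The real implementation has further overhead, so treat these numbers only as an order-of-magnitude comparison.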

#!pip install dask_ml --upgrade
#!pip install dask distributed --upgrade

Step 1- Importing Libraries

We will import the make_circles dataset generator from scikit-learn, the clustering module from dask_ml, pandas for collecting the results, and a timer.

from sklearn.datasets import make_circles
from sklearn.utils import shuffle
import pandas as pd
from timeit import default_timer as tic
import dask_ml.cluster

Step 2- Generating the dataset.

We will generate the dataset with make_circles, which returns the sample coordinates x and the true circle labels y. Only x is fed into the clustering algorithm.

x, y = make_circles(n_samples=10_000, noise=0.05, random_state=0, factor=0.2)
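As a quick sanity check on what make_circles returns, the following minimal sketch inspects the shape of x, the class balance in y, and the average distance from the origin for each label. The variable name radius is introduced here only for illustration.

```python
import numpy as np
from sklearn.datasets import make_circles

# Same call as in the recipe: two concentric noisy circles.
x, y = make_circles(n_samples=10_000, noise=0.05, random_state=0, factor=0.2)

print(x.shape)         # (10000, 2): one row per sample, two coordinates
print(np.bincount(y))  # [5000 5000]: samples are split evenly between the circles

# Distance of every sample from the origin, to see which label is which circle.
radius = np.linalg.norm(x, axis=1)
print("mean radius, label 0:", radius[y == 0].mean())
print("mean radius, label 1:", radius[y == 1].mean())
```

Label 0 corresponds to the outer circle and label 1 to the inner one, whose radius is scaled by factor.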

Step 3- Creating Clusters.

We will create the spectral clustering model with n_clusters=2 (one cluster per circle) and n_components=100, then fit it on subsets of increasing size and record how long each fit takes.

Ns = [500, 1000, 2500, 5000]
timings = []
for n in Ns:
    t1 = tic()
    # Fit on the first n samples so that each timing matches its sample count
    dask_ml.cluster.SpectralClustering(n_clusters=2, n_components=100).fit(x[:n])
    timings.append(('dask-ml (approximate)', n, tic() - t1))

df = pd.DataFrame(timings, columns=['method', 'Samples', 'Fitting Time'])
df
