How to do PCA with Dask?

This recipe helps you do PCA with Dask

Recipe Objective

PCA stands for **Principal Component Analysis**. It reduces the dimensionality of a dataset by using singular value decomposition (SVD) to project the data onto a lower-dimensional space.
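To make the idea concrete, here is a minimal NumPy sketch of PCA via SVD. It illustrates the general technique and is not dask_ml's implementation; the helper name `pca_via_svd` and the sample data are made up for this example.

```python
import numpy as np

def pca_via_svd(data, n_components):
    """Project `data` (n_samples x n_features) onto its top principal components."""
    # PCA works on centered data, so subtract the per-feature mean first.
    centered = data - data.mean(axis=0)
    # Thin SVD: centered = U @ diag(S) @ Vt; the rows of Vt are the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Projecting onto the kept axes gives the lower-dimensional representation.
    return centered @ components.T, components

# Hypothetical sample data: 5 samples, 3 features.
sample = np.array([
    [2.0, 0.0, 1.0],
    [0.0, 1.0, 3.0],
    [1.0, 4.0, 0.0],
    [3.0, 2.0, 2.0],
    [4.0, 3.0, 4.0],
])
reduced, components = pca_via_svd(sample, n_components=2)
print(reduced.shape)  # (5, 2)
```

Here three features are reduced to two, and the kept axes are orthonormal, so the projection is a rotation followed by dropping the least informative direction.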

Depending on the size of the input data, this SVD-based approach can be much more memory efficient than a conventional PCA, and it allows sparse input as well. The algorithm has constant memory complexity.

#!pip install dask_ml
#!pip install dask distributed --upgrade

Step 1- Importing Libraries.

Importing PCA from dask_ml.decomposition along with other libraries.

import numpy as np
import dask.array as da
from dask_ml.decomposition import PCA

Step 2- Creating arrays.

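We will create a two-dimensional NumPy array (six samples, two features) and wrap it in a Dask array that holds all the data in a single chunk.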

x = np.array([[1, -6], [2, -5], [3, -4], [4, -3], [5, -2], [6, -1]])
X = da.from_array(x, chunks=x.shape)
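The `chunks` argument controls how Dask splits the array into blocks. As a small sketch (the `(3, 2)` block shape is an arbitrary choice for illustration, and the name `X_rows` is made up), you can inspect the chunking of the array and compare it with a version split into two row blocks:

```python
import numpy as np
import dask.array as da

x = np.array([[1, -6], [2, -5], [3, -4], [4, -3], [5, -2], [6, -1]])

# chunks=x.shape keeps the whole array in a single block.
X = da.from_array(x, chunks=x.shape)
print(X.chunks)     # ((6,), (2,))
print(X.numblocks)  # (1, 1)

# An illustrative alternative: two blocks of 3 rows each.
X_rows = da.from_array(x, chunks=(3, 2))
print(X_rows.chunks)     # ((3, 3), (2,))
print(X_rows.numblocks)  # (2, 1)

# Dask is lazy: nothing is computed until .compute() is called.
column_means = X_rows.mean(axis=0).compute()
print(column_means)
```

With larger data, smaller row blocks let Dask work on one piece at a time instead of loading everything at once.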

Step 3- Applying PCA to the arrays.

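We will fit a PCA model on the array. Because the data has only two features, n_components=2 keeps both components; choosing a smaller value would reduce the number of features.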

pca = PCA(n_components=2)
pca.fit(X)

Step 4- Printing explained variance ratio.

We will print the explained variance ratio, which shows the fraction of the total variance captured by each principal component.

print(pca.explained_variance_ratio_)
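As a rough cross-check, the same quantity can be computed by hand with NumPy from the singular values of the centered data. This is a sketch of the underlying arithmetic, not dask_ml's code. In the sample array from this recipe every point lies on one straight line, so the first component should account for essentially all of the variance.

```python
import numpy as np

x = np.array([[1, -6], [2, -5], [3, -4], [4, -3], [5, -2], [6, -1]])

# Center each feature, then take the singular values of the centered data.
centered = x - x.mean(axis=0)
singular_values = np.linalg.svd(centered, compute_uv=False)

# Variance along each component is s**2 / (n_samples - 1).
explained_variance = singular_values ** 2 / (x.shape[0] - 1)
explained_variance_ratio = explained_variance / explained_variance.sum()

# The first ratio is close to 1 and the second close to 0.
print(np.round(explained_variance_ratio, 6))
```

A ratio this lopsided suggests that a single component would be enough to describe this particular dataset.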
