How to do PCA with Dask?
MACHINE LEARNING RECIPES DATA CLEANING PYTHON DATA MUNGING PANDAS CHEATSHEET     ALL TAGS

# How to do PCA with Dask?

This recipe helps you do PCA with Dask

0

## Recipe Objective

How to do PCA with Dask?

PCA stands for **principal component Analysis**. It is used to reduce the dimensionality of a model using SVD to project in the lower dimensional data.

This algorithm depends on the size of the input data, SVD can be much more memory efficient than a PCA, and it allows sparse input as well. This algorithm has constant memory complexity.

``` #!pip install dask_ml #!pip install dask distributed --upgrade ```

## Step 1- Importing Libraries.

Importing PCA from dask_ml.decomposition along with other libraries.

``` import numpy as np import dask.array as da from dask_ml.decomposition import PCA ```

## Step 2- Creating arrays.

We will create multi dimensional array.

``` x = np.array([[1, -6], [2, -5], [3, -4], [4, -3], [5, -2], [6, -1]]) X = da.from_array(x, chunks=x.shape) ```

## Step 3- Applying PCA to the arrays.

We will reduce the features by applying PCA to the arrays.

``` pca = PCA(n_components=2) pca.fit(X) ```

## Step 4- Printing explained variance ratio.

We will print the explained variance ratio to better understand the model working.

``` print(pca.explained_variance_ratio_) ```

#### Relevant Projects

##### Credit Card Fraud Detection as a Classification Problem
In this data science project, we will predict the credit card fraud in the transactional dataset using some of the predictive models.

##### Predict Credit Default | Give Me Some Credit Kaggle
In this data science project, you will predict borrowers chance of defaulting on credit loans by building a credit score prediction model.

##### Resume parsing with Machine learning - NLP with Python OCR and Spacy
In this machine learning resume parser example we use the popular Spacy NLP python library for OCR and text classification.

##### Loan Eligibility Prediction using Gradient Boosting Classifier
This data science in python project predicts if a loan should be given to an applicant or not. We predict if the customer is eligible for loan based on several factors like credit score and past history.

##### Census Income Data Set Project - Predict Adult Census Income
Use the Adult Income dataset to predict whether income exceeds 50K yr based on census data.

##### Predict Employee Computer Access Needs in Python
Data Science Project in Python- Given his or her job role, predict employee access needs using amazon employee database.

##### Predict Churn for a Telecom company using Logistic Regression
Machine Learning Project in R- Predict the customer churn of telecom sector and find out the key drivers that lead to churn. Learn how the logistic regression model using R can be used to identify the customer churn in telecom dataset.

##### Predict Macro Economic Trends using Kaggle Financial Dataset
In this machine learning project, you will uncover the predictive value in an uncertain world by using various artificial intelligence, machine learning, advanced regression and feature transformation techniques.

##### Build a Music Recommendation Algorithm using KKBox's Dataset
Music Recommendation Project using Machine Learning - Use the KKBox dataset to predict the chances of a user listening to a song again after their very first noticeable listening event.

##### Data Science Project in Python on BigMart Sales Prediction
The goal of this data science project is to build a predictive model and find out the sales of each product at a given Big Mart store.