How to reduce dimensionality using PCA in Python?

This recipe shows how to reduce the dimensionality of a dataset with principal component analysis (PCA) in Python, using scikit-learn.

In [1]:
## How to reduce dimensionality using PCA in Python
def Snippet_123():
    print()
    print(format('How to reduce dimensionality using PCA in Python','*^82'))

    import warnings
    warnings.filterwarnings("ignore")

    # Load libraries
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn import datasets

    # Load the digits dataset (1,797 samples of 8x8 images, 64 features)
    digits = datasets.load_digits()

    # Standardize each feature to zero mean and unit variance
    X = StandardScaler().fit_transform(digits.data)
    print(); print(X)

    # Create a PCA that retains 85% of the variance;
    # whiten=True rescales each retained component to unit variance
    pca = PCA(n_components=0.85, whiten=True)

    # Fit the PCA and project the data onto the retained components
    X_pca = pca.fit_transform(X)
    print(); print(X_pca)

    # Show results
    print('Original number of features:', X.shape[1])
    print('Reduced number of features:', X_pca.shape[1])

    # Create a PCA that keeps exactly 2 components
    pca = PCA(n_components=2, whiten=True)
    # Fit the PCA and project the data onto the 2 components
    X_pca = pca.fit_transform(X)
    print(); print(X_pca)
    # Show results
    print('Original number of features:', X.shape[1])
    print('Reduced number of features:', X_pca.shape[1])

Snippet_123()
*****************How to reduce dimensionality using PCA in Python*****************

[[ 0.         -0.33501649 -0.04308102 ... -1.14664746 -0.5056698
  -0.19600752]
 [ 0.         -0.33501649 -1.09493684 ...  0.54856067 -0.5056698
  -0.19600752]
 [ 0.         -0.33501649 -1.09493684 ...  1.56568555  1.6951369
  -0.19600752]
 ...
 [ 0.         -0.33501649 -0.88456568 ... -0.12952258 -0.5056698
  -0.19600752]
 [ 0.         -0.33501649 -0.67419451 ...  0.8876023  -0.5056698
  -0.19600752]
 [ 0.         -0.33501649  1.00877481 ...  0.8876023  -0.26113572
  -0.19600752]]

[[ 0.70631939 -0.39512814 -1.73816236 ...  0.60320435 -0.94455291
  -0.60204272]
 [ 0.21732591  0.38276482  1.72878893 ... -0.56722002  0.61131544
   1.02457999]
 [ 0.4804351  -0.13130437  1.33172761 ... -1.51284419 -0.48470912
  -0.52826811]
 ...
 [ 0.37732433 -0.0612296   1.0879821  ...  0.04925597  0.29271531
  -0.33891255]
 [ 0.39705007 -0.15768102 -1.08160094 ...  1.31785641  0.38883981
  -1.21854835]
 [-0.46407544 -0.92213976  0.12493334 ... -1.27242756 -0.34190284
  -1.17852306]]
Original number of features: 64
Reduced number of features: 25
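
The run above reports that 25 of the 64 features are enough to retain 85% of the variance. As a rough sketch of where such a count comes from, the block below fits a PCA that keeps every component and finds the smallest number whose cumulative explained variance ratio reaches the threshold. The names `full_pca`, `cumulative`, `threshold`, and `n_needed` are illustrative and not part of the original recipe, and `svd_solver="full"` is set here only as an assumption to keep the run deterministic.

```python
import numpy as np
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Same preprocessing as the recipe: standardized digits features
X = StandardScaler().fit_transform(datasets.load_digits().data)

# Keep every component so the full variance spectrum is available
full_pca = PCA(svd_solver="full").fit(X)
cumulative = np.cumsum(full_pca.explained_variance_ratio_)

# Smallest number of components whose cumulative ratio reaches the threshold
threshold = 0.85  # illustrative name; matches n_components=0.85 in the recipe
n_needed = int(np.searchsorted(cumulative, threshold) + 1)

# Compare with the count scikit-learn picks for a fractional n_components
auto_pca = PCA(n_components=threshold, svd_solver="full").fit(X)
print("Components needed:", n_needed)
print("Components chosen by PCA:", auto_pca.n_components_)
```

Printing or plotting `cumulative` is also a convenient way to choose a threshold other than 0.85.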

[[ 0.70632396 -0.3951369 ]
 [ 0.21732429  0.38276531]
 [ 0.48042968 -0.1313031 ]
 ...
 [ 0.37732239 -0.06123449]
 [ 0.3970504  -0.15768443]
 [-0.46407124 -0.92214378]]
Original number of features: 64
Reduced number of features: 2
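
Both PCA objects in the recipe are created with `whiten=True`, which is why the projected values above are all of order one. The following minimal sketch, under the assumption that `svd_solver="full"` is used for both fits so they are directly comparable, projects the same data with and without whitening and checks that whitening only rescales each component to unit sample variance. The variable names (`plain`, `white`, `rescaled`) are hypothetical.

```python
import numpy as np
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Same preprocessing as the recipe: standardized digits features
X = StandardScaler().fit_transform(datasets.load_digits().data)

# Same 2-component projection, with and without whitening
plain = PCA(n_components=2, svd_solver="full").fit_transform(X)
white = PCA(n_components=2, whiten=True, svd_solver="full").fit_transform(X)

# Sample variance (ddof=1) of each projected column
plain_var = plain.var(axis=0, ddof=1)
white_var = white.var(axis=0, ddof=1)
print("Variance without whitening:", plain_var)
print("Variance with whitening:", white_var)

# Whitening divides each component by its standard deviation
rescaled = plain / np.sqrt(plain_var)
print("Same up to scaling:", np.allclose(rescaled, white))
```

Whitening discards the relative scale of the components, so leave `whiten` at its default of `False` when a downstream step depends on that scale.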
