How to create a sparse Matrix in Python?
DATA MUNGING

How to create a sparse Matrix in Python?

How to create a sparse Matrix in Python?

This recipe helps you create a sparse Matrix in Python

3

There are two popular kinds of matrices: dense and sparse. Sparse matrices have lots of 'zero' values. In machine learning projects, the learning algorithms require the data to be in-memory. If the data needed for the learning (dataframe) is not in the RAM, then the algorithm does not work. By converting a dense matrix into a sparse matrix it can be made to fit in the RAM.

There are many data structures that can be used to construct a sparse matrix in python. Python Scipy provides the following ways to represent a sparse matrix:
- Block Sparse Row matrix (BSR)
- Coordinate list matrix (COO)
- Compressed Sparse Column matrix (CSC)
- Compressed Sparse Row matrix (CSR)
- Sparse matrix with DIAgonal storage (DIA)
- Dictionary Of Keys based sparse matrix (DOK)
- Row-based linked list sparse matrix (LIL)

The recipe above takes a dense matrix and displays the various formats of sparse matrix that scipy supports.

References: https://docs.scipy.org/doc/scipy/reference/sparse.html

In [1]:
## How to Create A Sparse Matrix
def Kickstarter_Example_2():
    print()
    print(format('How to Create A Sparse Matrix', '*^50'))

    # Load libraries
    import numpy as np
    from scipy import sparse

    # Create a matrix
    matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
    print()
    print("Original Matrix: \n", matrix)

    # Create sparse matrices
    print()
    print("Sparse Matrices: ")
    print()
    print(sparse.csr_matrix(matrix))
    print()
    print(sparse.bsr_matrix(matrix))
    print()
    print(sparse.coo_matrix(matrix))
    print()
    print(sparse.csc_matrix(matrix))
    print()
    print(sparse.dia_matrix(matrix))
    print()
    print(sparse.dok_matrix(matrix))
    print()
    print(sparse.lil_matrix(matrix))
    print()
Kickstarter_Example_2()
**********How to Create A Sparse Matrix***********

Original Matrix:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

Sparse Matrices:

  (0, 0)	1
  (0, 1)	2
  (0, 2)	3
  (1, 0)	4
  (1, 1)	5
  (1, 2)	6
  (2, 0)	7
  (2, 1)	8
  (2, 2)	9

  (0, 0)	1
  (0, 1)	2
  (0, 2)	3
  (1, 0)	4
  (1, 1)	5
  (1, 2)	6
  (2, 0)	7
  (2, 1)	8
  (2, 2)	9

  (0, 0)	1
  (0, 1)	2
  (0, 2)	3
  (1, 0)	4
  (1, 1)	5
  (1, 2)	6
  (2, 0)	7
  (2, 1)	8
  (2, 2)	9

  (0, 0)	1
  (1, 0)	4
  (2, 0)	7
  (0, 1)	2
  (1, 1)	5
  (2, 1)	8
  (0, 2)	3
  (1, 2)	6
  (2, 2)	9

  (2, 0)	7
  (1, 0)	4
  (2, 1)	8
  (0, 0)	1
  (1, 1)	5
  (2, 2)	9
  (0, 1)	2
  (1, 2)	6
  (0, 2)	3

  (0, 0)	1
  (0, 1)	2
  (0, 2)	3
  (1, 0)	4
  (1, 1)	5
  (1, 2)	6
  (2, 0)	7
  (2, 1)	8
  (2, 2)	9

  (0, 0)	1
  (0, 1)	2
  (0, 2)	3
  (1, 0)	4
  (1, 1)	5
  (1, 2)	6
  (2, 0)	7
  (2, 1)	8
  (2, 2)	9

Relevant Projects

Learn to prepare data for your next machine learning project
Text data requires special preparation before you can start using it for any machine learning project.In this ML project, you will learn about applying Machine Learning models to create classifiers and learn how to make sense of textual data.

Choosing the right Time Series Forecasting Methods
There are different time series forecasting methods to forecast stock price, demand etc. In this machine learning project, you will learn to determine which forecasting method to be used when and how to apply with time series forecasting example.

Predict Employee Computer Access Needs in Python
Data Science Project in Python- Given his or her job role, predict employee access needs using amazon employee database.

Mercari Price Suggestion Challenge Data Science Project
Data Science Project in Python- Build a machine learning algorithm that automatically suggests the right product prices.

Data Science Project - Instacart Market Basket Analysis
Data Science Project - Build a recommendation engine which will predict the products to be purchased by an Instacart consumer again.

Predict Churn for a Telecom company using Logistic Regression
Machine Learning Project in R- Predict the customer churn of telecom sector and find out the key drivers that lead to churn. Learn how the logistic regression model using R can be used to identify the customer churn in telecom dataset.

Data Science Project-All State Insurance Claims Severity Prediction
Data science project in R to develop automated methods for predicting the cost and severity of insurance claims.

Loan Eligibility Prediction using Gradient Boosting Classifier
This data science in python project predicts if a loan should be given to an applicant or not. We predict if the customer is eligible for loan based on several factors like credit score and past history.

Music Recommendation System Project using Python and R
Machine Learning Project - Work with KKBOX's Music Recommendation System dataset to build the best music recommendation engine.

Predict Macro Economic Trends using Kaggle Financial Dataset
In this machine learning project, you will uncover the predictive value in an uncertain world by using various artificial intelligence, machine learning, advanced regression and feature transformation techniques.