How to use SciPy Sparse matrix in Python?

This recipe explains how to use SciPy sparse matrices in Python.

Sparse matrices are an essential tool in data analysis, machine learning, and scientific computing. They efficiently store and manipulate matrices with a substantial number of zero or insignificant elements, saving memory and computation time. In this guide, we will explore how to create, manipulate, and perform various operations with sparse matrices in Python.


Understanding SciPy Sparse Matrix in Python

A sparse matrix is a data structure designed to store and manipulate matrices with a large number of zero values efficiently. In contrast to traditional dense matrices, sparse matrices store only the non-zero elements (plus their positions), which can significantly reduce memory usage and computation when most entries are zero. In Python, sparse matrices are provided by SciPy's scipy.sparse module, which works alongside NumPy arrays.
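
To make the memory argument concrete, here is a minimal sketch that compares the bytes held by a dense NumPy array with the bytes held by the same data in CSR form. The 1000 x 1000 diagonal matrix is an assumed example, not data from this recipe, and the exact CSR byte count depends on the index dtype SciPy chooses on your platform.

```python
import numpy as np
from scipy import sparse

# Illustrative 1000 x 1000 matrix with only the diagonal filled (assumed example data).
dense = np.zeros((1000, 1000), dtype=np.float64)
np.fill_diagonal(dense, 1.0)

csr = sparse.csr_matrix(dense)

# Memory held by the dense array versus the three arrays CSR stores internally.
dense_bytes = dense.nbytes
csr_bytes = csr.data.nbytes + csr.indices.nbytes + csr.indptr.nbytes

print("Stored non-zero elements:", csr.nnz)
print("Dense storage (bytes):", dense_bytes)
print("CSR storage (bytes):", csr_bytes)
```

The saving only appears when most entries are zero; a matrix with few zeros can take more space in a sparse format than in a dense one, because the positions have to be stored as well.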

How to create a Sparse Matrix in Python?

Building a sparse matrix by hand is tedious, but SciPy can do it for us. Matrices come in two broad kinds: dense and sparse. A sparse matrix is one in which most of the values are zero. Many machine learning algorithms need the whole dataset in memory, and they cannot run if the data (for example, a dataframe) does not fit in RAM. Storing a mostly-zero dense matrix in a sparse format can shrink it enough to fit.

In this guide, we will walk you through creating sparse matrices using SciPy and explore different formats. We will create a dense matrix and then convert it into various formats of sparse matrices using SciPy.

Step 1: Import the Library

import numpy as np

from scipy import sparse

We have imported the necessary libraries to work with sparse matrices.

Step 2: Setting Up the Matrix

Next, we will create a dense matrix that we will use to create sparse matrices. Here's the original matrix:

matrix = np.array([[9, 8, 7],

                   [6, 5, 4],

                   [3, 2, 1]])

print("Original Matrix:\n", matrix)

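This step sets up our original dense matrix. Note that this small example has no zero entries, so every value will be stored in each sparse format; it only illustrates the conversion calls.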

Step 3: Creating Sparse Matrices

Now, we will create various formats of sparse matrices using the original dense matrix. Here are the different formats supported by SciPy:

  • Dictionary Of Keys based sparse matrix (DOK)

  • Block Sparse Row matrix (BSR)

  • Coordinate list matrix (COO)

  • Compressed Sparse Column matrix (CSC)

  • Compressed Sparse Row matrix (CSR)

  • Sparse matrix with DIAgonal storage (DIA)

  • Row-based linked list sparse matrix (LIL)

Let's create these sparse matrices:

Creating Dictionary Of Keys based sparse matrix (DOK)

print(sparse.dok_matrix(matrix))

Creating Block Sparse Row matrix (BSR)

print(sparse.bsr_matrix(matrix))

Creating Coordinate list matrix (COO)

print(sparse.coo_matrix(matrix))

Creating Compressed Sparse Column matrix (CSC)

print(sparse.csc_matrix(matrix))

Creating Compressed Sparse Row matrix (CSR)

print(sparse.csr_matrix(matrix))

Creating Sparse matrix with DIAgonal storage (DIA)

print(sparse.dia_matrix(matrix))

Creating Row-based linked list sparse matrix (LIL)

print(sparse.lil_matrix(matrix))

These steps demonstrate how to create different sparse matrix formats from a dense matrix using SciPy.

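Running the code above prints the original matrix followed by each sparse representation as (row, column) value entries. The exact formatting may differ between SciPy versions: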

Original Matrix: 

 [[9 8 7]

 [6 5 4]

 [3 2 1]]

Sparse Matrices: 

  (0, 0) 9

  (0, 1) 8

  (0, 2) 7

  (1, 0) 6

  (1, 1) 5

  (1, 2) 4

  (2, 0) 3

  (2, 1) 2

  (2, 2) 1

 

  (0, 0) 9

  (0, 1) 8

  (0, 2) 7

  (1, 0) 6

  (1, 1) 5

  (1, 2) 4

  (2, 0) 3

  (2, 1) 2

  (2, 2) 1

 

  (0, 0) 9

  (0, 1) 8

  (0, 2) 7

  (1, 0) 6

  (1, 1) 5

  (1, 2) 4

  (2, 0) 3

  (2, 1) 2

  (2, 2) 1

 

  (0, 0) 9

  (1, 0) 6

  (2, 0) 3

  (0, 1) 8

  (1, 1) 5

  (2, 1) 2

  (0, 2) 7

  (1, 2) 4

  (2, 2) 1

 

  (0, 0) 9

  (0, 1) 8

  (0, 2) 7

  (1, 0) 6

  (1, 1) 5

  (1, 2) 4

  (2, 0) 3

  (2, 1) 2

  (2, 2) 1

 

  (2, 0) 3

  (1, 0) 6

  (2, 1) 2

  (0, 0) 9

  (1, 1) 5

  (2, 2) 1

  (0, 1) 8

  (1, 2) 4

  (0, 2) 7

 

  (0, 0) 9

  (0, 1) 8

  (0, 2) 7

  (1, 0) 6

  (1, 1) 5

  (1, 2) 4

  (2, 0) 3

  (2, 1) 2

  (2, 2) 1
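
Because the 3 x 3 example above contains no zeros, it does not show what a sparse format actually leaves out. The sketch below uses a small, mostly-zero matrix (illustrative values, not from the original recipe) and inspects the three arrays a CSR matrix keeps internally, then converts it to COO without going through a dense array.

```python
import numpy as np
from scipy import sparse

# Illustrative matrix that is mostly zeros (not from the original recipe).
mostly_zero = np.array([[0, 0, 3],
                        [4, 0, 0],
                        [0, 0, 5]])

csr = sparse.csr_matrix(mostly_zero)

print("Stored elements:", csr.nnz)   # number of explicitly stored values
print("data:   ", csr.data)          # the stored values, row by row
print("indices:", csr.indices)       # column index of each stored value
print("indptr: ", csr.indptr)        # where each row starts in data/indices

# Formats can be converted into one another directly.
coo = csr.tocoo()
print("COO rows:", coo.row, "cols:", coo.col)
```

Only three of the nine entries are stored here. As a rough guide, CSR and CSC are the usual choices for arithmetic, while COO, DOK, and LIL are more convenient for building a matrix incrementally.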

Converting a Sparse Matrix to a Full Matrix in Python

You can convert a dense matrix to a sparse matrix to save memory, and convert it back to a dense (full) matrix using the .toarray() method. The snippet below reuses the sparse module and the matrix array from the steps above.

# Dense -> sparse (CSR format)

sparse_matrix = sparse.csr_matrix(matrix)

# Sparse -> dense NumPy array

dense_matrix = sparse_matrix.toarray()
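
As a quick sanity check, the following self-contained sketch round-trips a small array through CSR and back and confirms that nothing is lost. The array values are illustrative assumptions.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Illustrative dense input with mostly zero entries.
dense_in = np.array([[0, 0, 1],
                     [2, 0, 0],
                     [0, 0, 0]])

sparse_matrix = csr_matrix(dense_in)     # dense -> sparse
dense_out = sparse_matrix.toarray()      # sparse -> dense NumPy array

print(type(dense_out).__name__)
print(np.array_equal(dense_in, dense_out))
```

Keep in mind that .toarray() allocates the full dense array, so for very large matrices it can undo the memory savings that motivated the sparse format.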

Eigenvalues of Sparse Matrix in Python

To find the eigenvalues of a sparse matrix, you can use SciPy's scipy.sparse.linalg.eigs function, which computes a few eigenvalues (not the full spectrum) of a square matrix without converting it to dense form. Here's a guide to finding the eigenvalues of a sparse matrix:

Step 1: Import the Libraries

We import the necessary libraries, including NumPy and SciPy.

import numpy as np

from scipy.sparse.linalg import eigs

from scipy.sparse import csc_matrix

Step 2: Create a Sparse Matrix

We create a sparse matrix using the csc_matrix constructor. You need to specify the data, row indices, column indices, and the shape of the matrix.

# Create a sparse matrix

data = np.array([1, 2, 3, 4, 5, 6, 7, 8])

row_indices = np.array([0, 0, 1, 1, 2, 2, 3, 3])

column_indices = np.array([0, 3, 1, 2, 0, 3, 1, 2])

sparse_matrix = csc_matrix((data, (row_indices, column_indices)), shape=(4, 4))

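Step 3: Finding Eigenvalues of a Sparse Matrix in Python

We use the eigs function from scipy.sparse.linalg to find the eigenvalues of the sparse matrix. The k parameter specifies the number of eigenvalues to compute. For an N x N matrix, eigs requires k to be smaller than N - 1, so for this 4 x 4 matrix we request k=2. The matrix must also hold floating-point rather than integer values, so we cast it first.

# Find the 2 eigenvalues of largest magnitude (k must be < N - 1)

eigenvalues, _ = eigs(sparse_matrix.astype(float), k=2)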

Step 4: Print the Eigenvalues

Finally, we print the eigenvalues of the sparse matrix.

print("Eigenvalues of the sparse matrix:")

print(eigenvalues)

Make sure to adjust the data, row indices, column indices, and shape according to your specific sparse matrix.
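
One way to gain confidence in the result is to compare eigs with a dense solver on a matrix small enough to convert. The sketch below is an illustration under assumed data: a 6 x 6 upper-triangular band matrix whose eigenvalues are its diagonal entries. It checks that the two largest-magnitude eigenvalues agree with numpy.linalg.eigvals.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import eigs

# Illustrative 6 x 6 sparse matrix: diagonal 1..6 plus one band above it (assumed data).
n = 6
rows = np.concatenate([np.arange(n), np.arange(n - 1)])
cols = np.concatenate([np.arange(n), np.arange(1, n)])
vals = np.concatenate([np.arange(1.0, n + 1.0), np.full(n - 1, 0.5)])
A = csc_matrix((vals, (rows, cols)), shape=(n, n))

# Sparse solver: the k=2 eigenvalues of largest magnitude (k must be < n - 1).
sparse_vals, _ = eigs(A, k=2, which="LM")

# Dense reference: all eigenvalues, then keep the two largest in magnitude.
dense_vals = np.linalg.eigvals(A.toarray())
dense_top2 = dense_vals[np.argsort(-np.abs(dense_vals))[:2]]

print(np.sort(sparse_vals.real))
print(np.sort(dense_top2.real))
```

eigs returns complex values even when the eigenvalues are real, which is why the comparison uses the real parts. For symmetric (or Hermitian) matrices, scipy.sparse.linalg.eigsh is the more specialized alternative.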

Sparse Matrix Operations

Sparse matrices support various matrix operations, such as addition, subtraction, multiplication, and more. You can perform these operations using the standard arithmetic operators or specialized functions from libraries like SciPy.

from scipy.sparse import csr_matrix

# Create sparse matrices

sparse_matrix1 = csr_matrix(...)

sparse_matrix2 = csr_matrix(...)

# Sparse Matrix sum in Python

result = sparse_matrix1 + sparse_matrix2

# Sparse matrix multiplication in Python (matrix product)

result = sparse_matrix1.dot(sparse_matrix2)
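
The constructor arguments above are left as placeholders. As a runnable illustration, the sketch below fills them with two small assumed matrices and shows addition, the matrix product, and the element-wise product side by side.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Two small illustrative matrices (assumed values).
a = csr_matrix(np.array([[1, 0],
                         [0, 2]]))
b = csr_matrix(np.array([[0, 3],
                         [4, 0]]))

total = a + b                 # element-wise sum, result stays sparse
product = a.dot(b)            # matrix product, equivalent to a @ b
elementwise = a.multiply(b)   # element-wise (Hadamard) product

print(total.toarray())
print(product.toarray())
print(elementwise.toarray())
```

With the csr_matrix class, the * operator also means the matrix product, whereas with the newer csr_array class it is element-wise. Using .dot() or @ for the matrix product and .multiply() for the element-wise product avoids that ambiguity.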

Saving and Loading Sparse Matrices

You can save a sparse matrix to a file and load it later with SciPy's scipy.sparse.save_npz and scipy.sparse.load_npz functions, which store the matrix in NumPy's .npz format.

from scipy.sparse import save_npz, load_npz

# Save sparse matrix to a file

save_npz('sparse_matrix.npz', sparse_matrix)

# Load sparse matrix from a file

loaded_matrix = load_npz('sparse_matrix.npz')
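
The following self-contained sketch round-trips a small matrix through a temporary directory and checks that the format and values survive. The matrix values and the file name example_matrix.npz are hypothetical, chosen only for illustration.

```python
import os
import tempfile

import numpy as np
from scipy.sparse import csr_matrix, load_npz, save_npz

# Illustrative matrix to persist (assumed values).
original = csr_matrix(np.array([[0, 1, 0],
                                [2, 0, 0]]))

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example_matrix.npz")  # hypothetical file name
    save_npz(path, original)      # write the sparse matrix to disk
    restored = load_npz(path)     # read it back before the directory is removed

same_format = restored.format == original.format
same_values = np.array_equal(restored.toarray(), original.toarray())
print(same_format, same_values)
```

Writing to a temporary directory keeps the example from leaving files behind; in real use you would pass a path of your own choosing.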

Learn more about Sparse Matrices with ProjectPro!

Sparse matrices are a crucial tool for handling large-scale data and optimizing computational resources. In this guide, we've covered the basics of creating, converting, and performing operations with sparse matrices in Python, along with computing eigenvalues and saving and loading them. These skills are invaluable for data scientists, machine learning practitioners, and researchers working with substantial datasets. To further enhance your knowledge and practical experience in data analysis, consider exploring ProjectPro, which offers a wide range of data science and big data projects. Start your journey of learning and skill development with ProjectPro today.
