How to use SciPy Sparse matrix in Python?

This recipe explains how to use SciPy sparse matrices in Python.

Sparse matrices are an essential tool in data analysis, machine learning, and scientific computing. They efficiently store and manipulate matrices with a substantial number of zero or insignificant elements, saving memory and computation time. In this guide, we will explore how to create, manipulate, and perform various operations with sparse matrices in Python.


Understanding SciPy Sparse Matrix in Python

A sparse matrix is a data structure designed to store and manipulate matrices with a large number of zero values efficiently. In contrast to traditional dense matrices, sparse matrices store only the non-zero elements, which can significantly reduce memory usage and computation when most entries are zero. In Python, the scipy.sparse module, which builds on NumPy arrays, provides the standard tools for working with sparse matrices.
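As a rough illustration of the memory savings, the sketch below builds a hypothetical 1000 x 1000 matrix with about 1% non-zero entries and compares the bytes used by the dense array with the bytes used by the three arrays that back its CSR form. The size, density, and variable names are assumptions chosen for this example only.

```python
import numpy as np
from scipy import sparse

# Hypothetical data: a 1000 x 1000 matrix with at most 10,000 non-zero entries
rng = np.random.default_rng(0)
dense = np.zeros((1000, 1000))
rows = rng.integers(0, 1000, size=10_000)
cols = rng.integers(0, 1000, size=10_000)
dense[rows, cols] = 1.0

csr = sparse.csr_matrix(dense)

# A dense float64 array stores every entry: 1000 * 1000 * 8 bytes
dense_bytes = dense.nbytes

# CSR stores only the non-zero values plus two index arrays
sparse_bytes = csr.data.nbytes + csr.indices.nbytes + csr.indptr.nbytes

print("dense bytes: ", dense_bytes)
print("sparse bytes:", sparse_bytes)
print("stored non-zeros:", csr.nnz)
```

The gap shrinks as the matrix fills up: for a matrix with few zeros, the index arrays can make the sparse form larger than the dense one.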

How to create a Sparse Matrix in Python?

Building a sparse matrix by hand is tedious, but SciPy can do it for us. Matrices broadly fall into two kinds: dense and sparse. Sparse matrices contain mostly zero values. In machine learning projects, most learning algorithms need the training data to be held in memory. If the data (for example, a DataFrame) does not fit in RAM, the algorithm cannot run as is. Converting a dense matrix that is mostly zeros into a sparse matrix can shrink it enough to fit in RAM.

In this guide, we will walk you through creating sparse matrices using SciPy and explore different formats. We will create a dense matrix and then convert it into various formats of sparse matrices using SciPy.

Step 1: Import the Library

import numpy as np

from scipy import sparse

We have imported the necessary libraries to work with sparse matrices.

Step 2: Setting Up the Matrix

Next, we will create a dense matrix that we will use to create sparse matrices. Here's the original matrix:

matrix = np.array([[9, 8, 7],

                   [6, 5, 4],

                   [3, 2, 1]])

print("Original Matrix:\n", matrix)

This step sets up our original dense matrix.

Step 3: Creating Sparse Matrices

Now, we will create various formats of sparse matrices using the original dense matrix. Here are the different formats supported by SciPy:

  • Dictionary Of Keys based sparse matrix (DOK)

  • Block Sparse Row matrix (BSR)

  • Coordinate list matrix (COO)

  • Compressed Sparse Column matrix (CSC)

  • Compressed Sparse Row matrix (CSR)

  • Sparse matrix with DIAgonal storage (DIA)

  • Row-based linked list sparse matrix (LIL)

Let's create these sparse matrices:

Creating Dictionary Of Keys based sparse matrix (DOK)

print(sparse.dok_matrix(matrix))

Creating Block Sparse Row matrix (BSR)

print(sparse.bsr_matrix(matrix))

Creating Coordinate list matrix (COO)

print(sparse.coo_matrix(matrix))

Creating Compressed Sparse Column matrix (CSC)

print(sparse.csc_matrix(matrix))

Creating Compressed Sparse Row matrix (CSR)

print(sparse.csr_matrix(matrix))

Creating Sparse matrix with DIAgonal storage (DIA)

print(sparse.dia_matrix(matrix))

Creating Row-based linked list sparse matrix (LIL)

print(sparse.lil_matrix(matrix))

These steps demonstrate how to create different sparse matrix formats from a dense matrix using SciPy.

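Printing each sparse matrix lists its stored entries as (row, column) value pairs. Because the example matrix contains no zeros, all nine entries appear in every format and only the order differs. The exact print layout may vary between SciPy versions. The output is: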

Original Matrix: 

 [[9 8 7]

 [6 5 4]

 [3 2 1]]

Sparse Matrices: 

  (0, 0) 9

  (0, 1) 8

  (0, 2) 7

  (1, 0) 6

  (1, 1) 5

  (1, 2) 4

  (2, 0) 3

  (2, 1) 2

  (2, 2) 1

 

  (0, 0) 9

  (0, 1) 8

  (0, 2) 7

  (1, 0) 6

  (1, 1) 5

  (1, 2) 4

  (2, 0) 3

  (2, 1) 2

  (2, 2) 1

 

  (0, 0) 9

  (0, 1) 8

  (0, 2) 7

  (1, 0) 6

  (1, 1) 5

  (1, 2) 4

  (2, 0) 3

  (2, 1) 2

  (2, 2) 1

 

  (0, 0) 9

  (1, 0) 6

  (2, 0) 3

  (0, 1) 8

  (1, 1) 5

  (2, 1) 2

  (0, 2) 7

  (1, 2) 4

  (2, 2) 1

 

  (0, 0) 9

  (0, 1) 8

  (0, 2) 7

  (1, 0) 6

  (1, 1) 5

  (1, 2) 4

  (2, 0) 3

  (2, 1) 2

  (2, 2) 1

 

  (2, 0) 3

  (1, 0) 6

  (2, 1) 2

  (0, 0) 9

  (1, 1) 5

  (2, 2) 1

  (0, 1) 8

  (1, 2) 4

  (0, 2) 7

 

  (0, 0) 9

  (0, 1) 8

  (0, 2) 7

  (1, 0) 6

  (1, 1) 5

  (1, 2) 4

  (2, 0) 3

  (2, 1) 2

  (2, 2) 1
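The example matrix above has no zeros, so it does not show what the formats actually save. The following sketch uses a hypothetical matrix with five zeros (the matrix and the variable names are assumptions for illustration) and prints the arrays that the COO and CSR formats keep internally.

```python
import numpy as np
from scipy import sparse

# Hypothetical 3 x 3 matrix with five zeros and four non-zero entries
mostly_zero = np.array([[0, 0, 3],
                        [4, 0, 0],
                        [0, 5, 6]])

coo = sparse.coo_matrix(mostly_zero)
csr = sparse.csr_matrix(mostly_zero)

# COO keeps one (row, column, value) triple per stored entry
print("COO rows:  ", coo.row)
print("COO cols:  ", coo.col)
print("COO values:", coo.data)

# CSR keeps the values, their column indices, and row pointers:
# row i occupies data[indptr[i]:indptr[i + 1]]
print("CSR data:   ", csr.data)
print("CSR indices:", csr.indices)
print("CSR indptr: ", csr.indptr)
```

Only the four non-zero values are stored in either format; the zeros are implied by the shape.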

Converting a Sparse Matrix to a Full Matrix in Python

You can convert a sparse matrix to a dense (full) matrix using the .toarray() method. Conversely, you can convert a dense matrix to a sparse matrix in Python to save memory.

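from scipy.sparse import csr_matrix

sparse_matrix = csr_matrix(matrix)      # dense -> sparse (matrix is the array from Step 2)

dense_matrix = sparse_matrix.toarray()  # sparse -> dense NumPy array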
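Sparse matrices can also be converted from one sparse format to another without going through a dense array. A minimal sketch, assuming a small hypothetical matrix and using the tocsc() and tocoo() methods:

```python
import numpy as np
from scipy import sparse

# Hypothetical matrix, chosen only for illustration
dense = np.array([[0, 0, 3],
                  [4, 0, 0],
                  [0, 5, 6]])

csr = sparse.csr_matrix(dense)   # dense -> CSR
csc = csr.tocsc()                # CSR -> CSC, no dense intermediate
coo = csr.tocoo()                # CSR -> COO

round_trip = csc.toarray()       # sparse -> dense NumPy array

print(csr.format, csc.format, coo.format)
print(np.array_equal(round_trip, dense))
```

The round trip returns the original values, so the choice of format affects storage and speed, not the contents.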

Eigenvalues of Sparse Matrix in Python

To find the eigenvalues of a sparse matrix, you can use libraries like SciPy, which provides functions like eigs for solving eigenvalue problems efficiently. Here's a guide to find the eigenvalues of a sparse matrix:

Step 1: Import the Libraries

We import the necessary libraries, including NumPy and SciPy.

import numpy as np

from scipy.sparse.linalg import eigs

from scipy.sparse import csc_matrix

Step 2: Create a Sparse Matrix

We create a sparse matrix using the csc_matrix constructor. You need to specify the data, row indices, column indices, and the shape of the matrix.

# Create a sparse matrix (float dtype: eigs works on floating-point data and may reject integers)

data = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)

row_indices = np.array([0, 0, 1, 1, 2, 2, 3, 3])

column_indices = np.array([0, 3, 1, 2, 0, 3, 1, 2])

sparse_matrix = csc_matrix((data, (row_indices, column_indices)), shape=(4, 4))

Step 3: Find the Eigenvalues of the Sparse Matrix

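We use the eigs function from scipy.sparse.linalg to find eigenvalues of the sparse matrix. The k parameter specifies the number of eigenvalues to compute. For an N x N sparse matrix, eigs requires k < N - 1, so for this 4 x 4 matrix k can be at most 2. By default, eigs returns the eigenvalues of largest magnitude.

# Find the 2 eigenvalues of largest magnitude

eigenvalues, _ = eigs(sparse_matrix, k=2)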

Step 4: Print the Eigenvalues

Finally, we print the eigenvalues of the sparse matrix.

print("Eigenvalues of the sparse matrix:")

print(eigenvalues)

Make sure to adjust the data, row indices, column indices, and shape according to your specific sparse matrix.
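One way to sanity-check the result on a small matrix is to compare it with a dense solver. The sketch below rebuilds the same 4 x 4 matrix with a floating-point dtype, asks eigs for the 2 eigenvalues of largest magnitude, and compares them with numpy.linalg.eigvals. The names A, sparse_vals, and dense_top2 are assumptions for this example; for large matrices the dense comparison would usually be too expensive.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import eigs

# Same 4 x 4 matrix as above, stored with a floating-point dtype
data = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
row_indices = np.array([0, 0, 1, 1, 2, 2, 3, 3])
column_indices = np.array([0, 3, 1, 2, 0, 3, 1, 2])
A = csc_matrix((data, (row_indices, column_indices)), shape=(4, 4))

# Sparse solver: the 2 eigenvalues of largest magnitude (k must be < N - 1)
sparse_vals, _ = eigs(A, k=2)

# Dense reference: all 4 eigenvalues, then keep the 2 of largest magnitude
dense_vals = np.linalg.eigvals(A.toarray())
dense_top2 = dense_vals[np.argsort(np.abs(dense_vals))[-2:]]

print("eigs: ", np.sort_complex(sparse_vals))
print("dense:", np.sort_complex(dense_top2))
```

eigs returns complex values even when the eigenvalues are real, and the order of the returned values is not guaranteed, which is why both results are sorted before printing.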

Sparse Matrix Operations

Sparse matrices support various matrix operations, such as addition, subtraction, multiplication, and more. You can perform these operations using the standard arithmetic operators or specialized functions from libraries like SciPy.

from scipy.sparse import csr_matrix

# Create sparse matrices

sparse_matrix1 = csr_matrix(...)

sparse_matrix2 = csr_matrix(...)

# Element-wise sum of two sparse matrices

result = sparse_matrix1 + sparse_matrix2

# Matrix product of two sparse matrices (not element-wise)

result = sparse_matrix1.dot(sparse_matrix2)
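For a concrete version of these operations, here is a small sketch with two hypothetical 2 x 2 matrices (the values are chosen only for illustration). It also shows the multiply() method, which performs the element-wise product, in contrast to the matrix product computed by .dot() or the @ operator.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Two small hypothetical matrices, chosen only for illustration
sparse_matrix1 = csr_matrix(np.array([[1, 0],
                                      [0, 2]]))
sparse_matrix2 = csr_matrix(np.array([[0, 3],
                                      [4, 5]]))

total = sparse_matrix1 + sparse_matrix2                # element-wise sum
product = sparse_matrix1 @ sparse_matrix2              # matrix product, same as .dot()
elementwise = sparse_matrix1.multiply(sparse_matrix2)  # element-wise product

print(total.toarray())        # values: [[1, 3], [4, 7]]
print(product.toarray())      # values: [[0, 3], [8, 10]]
print(elementwise.toarray())  # values: [[0, 0], [0, 10]]
```

All three results are themselves sparse matrices; toarray() is used here only to display them.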

Saving and Loading Sparse Matrices

SciPy can save a sparse matrix to a file and load it back later with the scipy.sparse.save_npz and scipy.sparse.load_npz functions, which use NumPy's .npz file format.

from scipy.sparse import save_npz, load_npz

# Save sparse matrix to a file

save_npz('sparse_matrix.npz', sparse_matrix)

# Load sparse matrix from a file

loaded_matrix = load_npz('sparse_matrix.npz')
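To confirm that nothing is lost in the round trip, the following sketch saves a small hypothetical CSR matrix to a temporary directory, loads it back, and checks that the format, shape, and values match. The matrix and the temporary path are assumptions for this example.

```python
import os
import tempfile

import numpy as np
from scipy.sparse import csr_matrix, save_npz, load_npz

# Hypothetical matrix, chosen only for illustration
original = csr_matrix(np.array([[0, 0, 3],
                                [4, 0, 0],
                                [0, 5, 6]]))

# Write to a temporary directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sparse_matrix.npz")
    save_npz(path, original)
    loaded = load_npz(path)

# The loaded matrix keeps the same format, shape, and values
print(loaded.format, loaded.shape)
print((loaded != original).nnz == 0)
```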

Learn more about Sparse Matrices with ProjectPro!

Sparse matrices are a crucial tool for handling large-scale data and optimizing computational resources. In this guide, we've covered the basics of creating, converting, and saving sparse matrices, computing their eigenvalues, and performing arithmetic on them in Python. These skills are invaluable for data scientists, machine learning practitioners, and researchers working with substantial datasets. To further enhance your knowledge and practical experience in data analysis, consider exploring ProjectPro, which offers a wide range of data science and big data projects. Start your journey of learning and skill development with ProjectPro today.
