How to draw a matrix of scatter plots using pandas?

This recipe helps you draw a matrix of scatter plots using pandas

Recipe Objective

Checking for collinearity among attributes of a dataset, is one of the most important steps in data preprocessing. A good way to understand the correlation among the features, is to create scatter plots for each pair of attributes.

So this recipe is a short example on How to draw a matrix of scatter plots using pandas. Let's get started.

Step 1 - Import the library

import pandas as pd import seaborn as sb

Let's pause and look at these imports. Pandas is generally used for performing mathematical operation and preferably over arrays. Seaborn is just used in here to import dataset.

Step 2 - Setup the Data

df = sb.load_dataset('tips')

Here we have imported tips dataset from seaborn library.

Now our dataset is ready.

Step 3 - Plotting Scatter matrix

pd.plotting.scatter_matrix(df[['total_bill','tip','size']], alpha=0.2)

Using scatter_matrix, we have plotted it against 3 columns.

Step 4 - Let's look at our dataset now

Once we run the above code snippet, we will see:

Scroll down to the ipython file to look at the results.

We can see scatter matrix against 3 columns. Similarly, we can check for other columns to check similarity.

What Users are saying..

profile image

Abhinav Agarwal

Graduate Student at Northwestern University
linkedin profile url

I come from Northwestern University, which is ranked 9th in the US. Although the high-quality academics at school taught me all the basics I needed, obtaining practical experience was a challenge.... Read More

Relevant Projects

Recommender System Machine Learning Project for Beginners-3
Content Based Recommender System Project - Building a Content-Based Product Recommender App with Streamlit

Mastering A/B Testing: A Practical Guide for Production
In this A/B Testing for Machine Learning Project, you will gain hands-on experience in conducting A/B tests, analyzing statistical significance, and understanding the challenges of building a solution for A/B testing in a production environment.

Build a CNN Model with PyTorch for Image Classification
In this deep learning project, you will learn how to build an Image Classification Model using PyTorch CNN

Classification Projects on Machine Learning for Beginners - 2
Learn to implement various ensemble techniques to predict license status for a given business.

Model Deployment on GCP using Streamlit for Resume Parsing
Perform model deployment on GCP for resume parsing model using Streamlit App.

Loan Eligibility Prediction Project using Machine learning on GCP
Loan Eligibility Prediction Project - Use SQL and Python to build a predictive model on GCP to determine whether an application requesting loan is eligible or not.

Deploy Transformer-BART Model on Paperspace Cloud
In this MLOps Project you will learn how to deploy a Tranaformer BART Model for Abstractive Text Summarization on Paperspace Private Cloud

MLOps using Azure Devops to Deploy a Classification Model
In this MLOps Azure project, you will learn how to deploy a classification machine learning model to predict the customer's license status on Azure through scalable CI/CD ML pipelines.

Time Series Project to Build a Multiple Linear Regression Model
Learn to build a Multiple linear regression model in Python on Time Series Data

Detectron2 Object Detection and Segmentation Example Python
Object Detection using Detectron2 - Build a Dectectron2 model to detect the zones and inhibitions in antibiogram images.