How to draw a matrix of scatter plots using pandas?

How to draw a matrix of scatter plots using pandas?

How to draw a matrix of scatter plots using pandas?

This recipe helps you draw a matrix of scatter plots using pandas


Recipe Objective

Checking for collinearity among attributes of a dataset, is one of the most important steps in data preprocessing. A good way to understand the correlation among the features, is to create scatter plots for each pair of attributes.

So this recipe is a short example on How to draw a matrix of scatter plots using pandas. Let's get started.

Step 1 - Import the library

import pandas as pd import seaborn as sb

Let's pause and look at these imports. Pandas is generally used for performing mathematical operation and preferably over arrays. Seaborn is just used in here to import dataset.

Step 2 - Setup the Data

df = sb.load_dataset('tips')

Here we have imported tips dataset from seaborn library.

Now our dataset is ready.

Step 3 - Plotting Scatter matrix

pd.plotting.scatter_matrix(df[['total_bill','tip','size']], alpha=0.2)

Using scatter_matrix, we have plotted it against 3 columns.

Step 4 - Let's look at our dataset now

Once we run the above code snippet, we will see:

Scroll down to the ipython file to look at the results.

We can see scatter matrix against 3 columns. Similarly, we can check for other columns to check similarity.

Relevant Projects

Human Activity Recognition Using Smartphones Data Set
In this deep learning project, you will build a classification system where to precisely identify human fitness activities.

German Credit Dataset Analysis to Classify Loan Applications
In this data science project, you will work with German credit dataset using classification techniques like Decision Tree, Neural Networks etc to classify loan applications using R.

Ensemble Machine Learning Project - All State Insurance Claims Severity Prediction
In this ensemble machine learning project, we will predict what kind of claims an insurance company will get. This is implemented in python using ensemble machine learning algorithms.

PySpark Tutorial - Learn to use Apache Spark with Python
PySpark Project-Get a handle on using Python with Spark through this hands-on data processing spark python tutorial.

Data Science Project-TalkingData AdTracking Fraud Detection
Machine Learning Project in R-Detect fraudulent click traffic for mobile app ads using R data science programming language.

Predict Credit Default | Give Me Some Credit Kaggle
In this data science project, you will predict borrowers chance of defaulting on credit loans by building a credit score prediction model.

Customer Market Basket Analysis using Apriori and Fpgrowth algorithms
In this data science project, you will learn how to perform market basket analysis with the application of Apriori and FP growth algorithms based on the concept of association rule learning.

Sequence Classification with LSTM RNN in Python with Keras
In this project, we are going to work on Sequence to Sequence Prediction using IMDB Movie Review Dataset​ using Keras in Python.

Zillow’s Home Value Prediction (Zestimate)
Data Science Project in R -Build a machine learning algorithm to predict the future sale prices of homes.

Predict Churn for a Telecom company using Logistic Regression
Machine Learning Project in R- Predict the customer churn of telecom sector and find out the key drivers that lead to churn. Learn how the logistic regression model using R can be used to identify the customer churn in telecom dataset.