How to draw a matrix of scatter plots using pandas?

How to draw a matrix of scatter plots using pandas?

How to draw a matrix of scatter plots using pandas?

This recipe helps you draw a matrix of scatter plots using pandas

Recipe Objective

Checking for collinearity among attributes of a dataset, is one of the most important steps in data preprocessing. A good way to understand the correlation among the features, is to create scatter plots for each pair of attributes.

So this recipe is a short example on How to draw a matrix of scatter plots using pandas. Let's get started.

Step 1 - Import the library

import pandas as pd import seaborn as sb

Let's pause and look at these imports. Pandas is generally used for performing mathematical operation and preferably over arrays. Seaborn is just used in here to import dataset.

Step 2 - Setup the Data

df = sb.load_dataset('tips')

Here we have imported tips dataset from seaborn library.

Now our dataset is ready.

Step 3 - Plotting Scatter matrix

pd.plotting.scatter_matrix(df[['total_bill','tip','size']], alpha=0.2)

Using scatter_matrix, we have plotted it against 3 columns.

Step 4 - Let's look at our dataset now

Once we run the above code snippet, we will see:

Scroll down to the ipython file to look at the results.

We can see scatter matrix against 3 columns. Similarly, we can check for other columns to check similarity.

Relevant Projects

Census Income Data Set Project - Predict Adult Census Income
Use the Adult Income dataset to predict whether income exceeds 50K yr based on census data.

Customer Churn Prediction Analysis using Ensemble Techniques
In this machine learning churn project, we implement a churn prediction model in python using ensemble techniques.

Data Science Project in Python on BigMart Sales Prediction
The goal of this data science project is to build a predictive model and find out the sales of each product at a given Big Mart store.

RASA NLU chatbot creation
The project will use rasa NLU for the Intent classifier, spacy for entity tagging, and mongo dB as the DB. The project will incorporate slot filling and context management and will be supporting the following intent and entities. Intents : product_info | ask_price|cancel_order Entities : product_name|location|order id The project will demonstrate how to generate data on the fly, annotate using framework and how to process those for different pieces of training as discussed above .

Predict Churn for a Telecom company using Logistic Regression
Machine Learning Project in R- Predict the customer churn of telecom sector and find out the key drivers that lead to churn. Learn how the logistic regression model using R can be used to identify the customer churn in telecom dataset.

Medical Image Segmentation Deep Learning Project
In this deep learning project, you will learn to implement Unet++ models for medical image segmentation to detect and classify colorectal polyps.

Churn Prediction in Telecom using Machine Learning in R
Estimating churners before they discontinue using a product or service is extremely important. In this ML project, you will develop a churn prediction model in telecom to predict customers who are most likely subject to churn.

Build OCR from Scratch Python using YOLO and Tesseract
In this deep learning project, you will learn how to build your custom OCR (optical character recognition) from scratch by using Google Tesseract and YOLO to read the text from any images.

Expedia Hotel Recommendations Data Science Project
In this data science project, you will contextualize customer data and predict the likelihood a customer will stay at 100 different hotel groups.

PySpark Tutorial - Learn to use Apache Spark with Python
PySpark Project-Get a handle on using Python with Spark through this hands-on data processing spark python tutorial.