How to create a scatter plot using plotly in R?

This recipe helps you create a scatter plot using plotly in R


Recipe Objective

A scatter plot is one of the simplest charts: it uses Cartesian coordinates to display the relationship between two variables, x and y. It is used to spot any trend or relationship between the two variables.

In this recipe, we use the plotly package to draw the scatter plot. The plotly package provides an interface to the plotly JavaScript library, allowing us to create interactive web-based graphics entirely in R. Plots created with plotly work in multiple formats, such as:

  1. R Markdown Documents
  2. Shiny apps - deploying on the web
  3. Windows viewer

Plotly is actively developed and supported by its community.

This recipe demonstrates how to draw a simple scatter plot in R using the plotly package.

STEP 1: Loading required library and dataset

Dataset description: The dataset contains basic information about customers visiting a supermarket mall. The variables we are interested in are Annual.Income (in 1000s), Spending Score and Age. Gender is also used in Step 2 to colour the points.

# Data manipulation packages
library(dplyr)
library(tidyverse)

# Interactive plotting package (provides plot_ly() and layout())
library(plotly)

# Reading the dataset
customer_seg = read.csv('R_123_Mall_Customers.csv')

# Selecting the required variables using the select() function
customer_seg_var = select(customer_seg, Age, Annual.Income..k.., Spending.Score..1.100.)

# Summary of the selected variables
glimpse(customer_seg_var)
Observations: 200
Variables: 3
$ Age                     19, 21, 20, 23, 31, 22, 35, 23, 64, 30, 67, 35…
$ Annual.Income..k..      15, 15, 16, 16, 17, 17, 18, 18, 19, 19, 19, 19…
$ Spending.Score..1.100.  39, 81, 6, 77, 40, 76, 6, 94, 3, 72, 14, 99, 1…
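
The dotted column names in the output above come from read.csv(), which by default (check.names = TRUE) converts headers into syntactically valid R names. The sketch below illustrates this with make.names(); the raw header strings are an assumption about how the CSV file is labelled, so check them against your own copy of the data.

```r
# Assumed original CSV headers (hypothetical; verify against your file)
raw_headers <- c("Age", "Annual Income (k$)", "Spending Score (1-100)")

# read.csv() applies make.names() to headers when check.names = TRUE,
# replacing spaces, brackets and other invalid characters with dots
clean_headers <- make.names(raw_headers)
print(clean_headers)
```

If the dotted names are awkward to type, the columns can be renamed after loading, for example with dplyr's rename().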

STEP 2: Plotting a scatter plot using Plotly

We use the plot_ly() function to plot a scatter plot between annual income and spending score variables.

Syntax: plot_ly( data = , x = , y = , color = )


  1. x = variable to be plotted on the x-axis, written as a formula (for example ~Age)
  2. y = variable to be plotted on the y-axis, written as a formula
  3. data = data frame to be used
  4. color = factor variable used to group the points by colour
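
As a minimal sketch of the arguments listed above, the example below uses the built-in iris data set as a stand-in for the customer data, so it runs without the CSV file. The type and mode arguments are added here to request a marker-only scatter trace explicitly; they are not part of the syntax shown in this recipe.

```r
library(plotly)

# Built-in iris data stands in for the mall customer data here
fig_iris <- plot_ly(
  data  = iris,            # data frame to read the columns from
  x     = ~Sepal.Length,   # the ~ prefix refers to a column in `data`
  y     = ~Petal.Length,
  color = ~Species,        # one colour per level of the factor
  type  = "scatter",       # explicit trace type
  mode  = "markers"        # draw points only, no connecting lines
)

# Printing the object renders the interactive plot
fig_iris
```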


  1. The %>% (pipe) operator used in the code below passes the result of plot_ly() to the next function as its first argument, which keeps the code readable as a chain of steps.
  2. We also use the layout() function to give the graph a title.

fig <- plot_ly(x = ~Annual.Income..k.., y = ~Spending.Score..1.100., data = customer_seg, color = ~Gender) %>%
  layout(title = 'Scatter Plot using Plotly')

# Display the figure inside a Jupyter notebook
embed_notebook(fig)
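
To illustrate what the pipe does, the sketch below builds the same figure twice, once with %>% and once as a nested call. It uses the built-in mtcars data set rather than the customer data, and it also sets axis titles through the xaxis and yaxis arguments of layout(); the title strings are illustrative choices, not part of this recipe.

```r
library(plotly)

# Piped form: the plot_ly() result becomes the first argument of layout()
fig_piped <- plot_ly(data = mtcars, x = ~wt, y = ~mpg,
                     type = "scatter", mode = "markers") %>%
  layout(
    title = "Fuel economy vs. weight",
    xaxis = list(title = "Weight (1000 lbs)"),
    yaxis = list(title = "Miles per gallon")
  )

# Equivalent nested form without the pipe
fig_nested <- layout(
  plot_ly(data = mtcars, x = ~wt, y = ~mpg,
          type = "scatter", mode = "markers"),
  title = "Fuel economy vs. weight",
  xaxis = list(title = "Weight (1000 lbs)"),
  yaxis = list(title = "Miles per gallon")
)
```

Both objects describe the same plot; the piped form reads top to bottom in the order the steps are applied, which is why it is preferred once more layout options are added.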
