How to create a boxplot using plotly in R?

This recipe helps you create a boxplot using plotly in R

Recipe Objective

Box-plots are also known as box-whisker plots. They are a special type of plot which showcases complex numerical data in a compact manner. It is more informative than a strip chart as it also displays the interquartile range, median and min-max range. It is mainly used to find outliers in the dataset. ​

In this recipe we are going to use Plotly package to plot the required box plot. Plotly package provides an interface to the plotly javascript library allowing us to create interactive web-based graphics entrirely in R. Plots created by plotly works in multiple format such as: ​

Explore Interesting IoT Project Ideas for Practice

  1. R Markdown Documents
  2. Shiny apps - deploying on the web
  3. Windows viewer

Plotly has been actively developed and supported by it's community. ​

This recipe demonstrates how to plot a box plot in R using plotly package. ​

STEP 1: Loading required library and dataset

Dataset description: It is the basic data about the customers going to the supermarket mall. The variables that we are interested in: Annual.Income (which is in 1000s) , Spending Score and Age

# Data manipulation package library(dplyr) library(tidyverse) # reading a dataset customer_seg = read.csv('R_131_Mall_Customers.csv') # selecting the required variables using the select() function customer_seg_var = select(customer_seg, Age, Annual.Income..k..,Spending.Score..1.100.) # summary of the selected variables glimpse(customer_seg_var)

Observations: 200
Variables: 3
$ Age                     19, 21, 20, 23, 31, 22, 35, 23, 64, 30, 67, 35…
$ Annual.Income..k..      15, 15, 16, 16, 17, 17, 18, 18, 19, 19, 19, 19…
$ Spending.Score..1.100.  39, 81, 6, 77, 40, 76, 6, 94, 3, 72, 14, 99, 1…

STEP 2: Plotting a box plot using Plotly

We use the plot_ly() function to plot a box plot between of Annual Income based on Gender.

Syntax: plot_ly( data = , x = , y = , type = "box")

Where:

  1. x = variable to be plotted in x axis
  2. y = variable to be plotted in y axis
  3. data = dataframe to be used
  4. type = type of the chart

Note:

  1. The %>% sign in the syntax earlier makes the code more readable and enables R to read further code without breaking it.
  2. We also use layout() function to give a title to the graph

fig <- plot_ly(x = ~Gender, y = ~Annual.Income..k.., data = customer_seg, type = "box") %>% layout(title = 'Box Plot using Plotly') embed_notebook(fig)

What Users are saying..

profile image

Ed Godalle

Director Data Analytics at EY / EY Tech
linkedin profile url

I am the Director of Data Analytics with over 10+ years of IT experience. I have a background in SQL, Python, and Big Data working with Accenture, IBM, and Infosys. I am looking to enhance my skills... Read More

Relevant Projects

Time Series Classification Project for Elevator Failure Prediction
In this Time Series Project, you will predict the failure of elevators using IoT sensor data as a time series classification machine learning problem.

Natural language processing Chatbot application using NLTK for text classification
In this NLP AI application, we build the core conversational engine for a chatbot. We use the popular NLTK text classification library to achieve this.

House Price Prediction Project using Machine Learning in Python
Use the Zillow Zestimate Dataset to build a machine learning model for house price prediction.

Build a Multi ClassText Classification Model using Naive Bayes
Implement the Naive Bayes Algorithm to build a multi class text classification model in Python.

Loan Default Prediction Project using Explainable AI ML Models
Loan Default Prediction Project that employs sophisticated machine learning models, such as XGBoost and Random Forest and delves deep into the realm of Explainable AI, ensuring every prediction is transparent and understandable.

NLP Project for Beginners on Text Processing and Classification
This Project Explains the Basic Text Preprocessing and How to Build a Classification Model in Python

Learn to Build an End-to-End Machine Learning Pipeline - Part 2
In this Machine Learning Project, you will learn how to build an end-to-end machine learning pipeline for predicting truck delays, incorporating Hopsworks' feature store and Weights and Biases for model experimentation.

Build ARCH and GARCH Models in Time Series using Python
In this Project we will build an ARCH and a GARCH model using Python

Demand prediction of driver availability using multistep time series analysis
In this supervised learning machine learning project, you will predict the availability of a driver in a specific area by using multi step time series analysis.

Loan Eligibility Prediction Project using Machine learning on GCP
Loan Eligibility Prediction Project - Use SQL and Python to build a predictive model on GCP to determine whether an application requesting loan is eligible or not.