How to create a violin plot using plotly in R?

This recipe helps you create a violin plot using plotly in R

Recipe Objective

Violin plots are similar to boxplots which showcases the probability density along with interquartile, median and range at different values. They are more informative than boxplots which are used to showcase the full distribution of the data. They are also known to combine the features of histogram and boxplots. They are mainly used to compare the distribution of different variables/columns in the dataset. ​

In this recipe we are going to use Plotly package to plot the required violin plot. Plotly package provides an interface to the plotly javascript library allowing us to create interactive web-based graphics entrirely in R. Plots created by plotly works in multiple format such as: ​

  1. R Markdown Documents
  2. Shiny apps - deploying on the web
  3. Windows viewer

Plotly has been actively developed and supported by it's community. ​

This recipe demonstrates how to plot a violin plot in R using plotly package. ​

STEP 1: Loading required library and dataset

Dataset description: It is the basic data about the customers going to the supermarket mall. The variable that we are interested in: Annual.Income (which is in 1000s), Spending Score and age

# Data manipulation package library(dplyr) library(tidyverse) # reading a dataset customer_seg = read.csv('R_132_Mall_Customers.csv') # selecting the required variables using the select() function customer_seg_var = select(customer_seg, Age, Annual.Income..k..,Spending.Score..1.100.) # summary of the selected variables glimpse(customer_seg_var)
Observations: 200
Variables: 3
$ Age                     19, 21, 20, 23, 31, 22, 35, 23, 64, 30, 67, 35…
$ Annual.Income..k..      15, 15, 16, 16, 17, 17, 18, 18, 19, 19, 19, 19…
$ Spending.Score..1.100.  39, 81, 6, 77, 40, 76, 6, 94, 3, 72, 14, 99, 1…

STEP 2: Plotting a Violin plot using Plotly

We use the plot_ly() function to plot a box plot between of Annual Income based on Gender.

Syntax: plot_ly( data = , x = , y = , type = "violin")

Where:

  1. x = variable to be plotted in x axis
  2. y = variable to be plotted in y axis
  3. data = dataframe to be used
  4. type = type of the chart

Note:

  1. The %>% sign in the syntax earlier makes the code more readable and enables R to read further code without breaking it.
  2. We also use layout() function to give a title to the graph
fig <- plot_ly(x = ~Gender, y = ~Annual.Income..k.., data = customer_seg, type = "violin", # to also plot box plot whiskers on top of the violin plot box = list(visible = T)) %>% layout(title = 'Violin Plot using Plotly') embed_notebook(fig)

What Users are saying..

profile image

Abhinav Agarwal

Graduate Student at Northwestern University
linkedin profile url

I come from Northwestern University, which is ranked 9th in the US. Although the high-quality academics at school taught me all the basics I needed, obtaining practical experience was a challenge.... Read More

Relevant Projects

Langchain Project for Customer Support App in Python
In this LLM Project, you will learn how to enhance customer support interactions through Large Language Models (LLMs), enabling intelligent, context-aware responses. This Langchain project aims to seamlessly integrate LLM technology with databases, PDF knowledge bases, and audio processing agents to create a comprehensive customer support application.

Recommender System Machine Learning Project for Beginners-3
Content Based Recommender System Project - Building a Content-Based Product Recommender App with Streamlit

Build a Autoregressive and Moving Average Time Series Model
In this time series project, you will learn to build Autoregressive and Moving Average Time Series Models to forecast future readings, optimize performance, and harness the power of predictive analytics for sensor data.

Azure Text Analytics for Medical Search Engine Deployment
Microsoft Azure Project - Use Azure text analytics cognitive service to deploy a machine learning model into Azure Databricks

Census Income Data Set Project-Predict Adult Census Income
Use the Adult Income dataset to predict whether income exceeds 50K yr based oncensus data.

Build CI/CD Pipeline for Machine Learning Projects using Jenkins
In this project, you will learn how to create a CI/CD pipeline for a search engine application using Jenkins.

Build a Logistic Regression Model in Python from Scratch
Regression project to implement logistic regression in python from scratch on streaming app data.

Build a Hybrid Recommender System in Python using LightFM
In this Recommender System project, you will build a hybrid recommender system in Python using LightFM .

Credit Card Default Prediction using Machine learning techniques
In this data science project, you will predict borrowers chance of defaulting on credit loans by building a credit score prediction model.

Build a Music Recommendation Algorithm using KKBox's Dataset
Music Recommendation Project using Machine Learning - Use the KKBox dataset to predict the chances of a user listening to a song again after their very first noticeable listening event.