How to find IQR of a column in R?
MACHINE LEARNING RECIPES DATA CLEANING PYTHON DATA MUNGING PANDAS CHEATSHEET     ALL TAGS

How to find IQR of a column in R?

How to find IQR of a column in R?

This recipe helps you find IQR of a column in R

0

Recipe Objective

Exploratory Data Analysis is a crucial step before building any machine learning model on a dataset. This also includes gathering statistical inferences from the data. There are a few main terms in statistics which describes the variability of the numeric variable. These include IQR, quartiles, quantiles, mean and median. They help us to detect any outliers in the column and the distribution of the column.

This recipe focuses on finding IQR (Inter-Quartile Range) of a column.

IQR is the measure of spread in the mid 50% (Half) of the data. It is the difference between the first and third quartile.

Step 1: Loading required library and loading dataset

# Data manipulation package library(tidyverse) ​ # reading a dataset customer_seg = read.csv('R_67_Mall_Customers.csv') ​ glimpse(customer_seg)
Rows: 200
Columns: 5
$ CustomerID              1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 1...
$ Gender                  Male, Male, Female, Female, Female, Female, ...
$ Age                     19, 21, 20, 23, 31, 22, 35, 23, 64, 30, 67, ...
$ Annual.Income..k..      15, 15, 16, 16, 17, 17, 18, 18, 19, 19, 19, ...
$ Spending.Score..1.100.  39, 81, 6, 77, 40, 76, 6, 94, 3, 72, 14, 99,...

Dataset description: It is the basic data about the customers going to the supermarket mall. The variable that we interested in finding the IQR is Annual.Income which is in 1000s.

Step 2: Calculating IQR

We use the IQR() function to calculate the Interquartile range

IQR(customer_seg$Annual.Income..k..)
36.5

Note: this means that 36.5 is the mid 50% range of the dataset

Relevant Projects

Identifying Product Bundles from Sales Data Using R Language
In this data science project in R, we are going to talk about subjective segmentation which is a clustering technique to find out product bundles in sales data.

Forecast Inventory demand using historical sales data in R
In this machine learning project, you will develop a machine learning model to accurately forecast inventory demand based on historical sales data.

Perform Time series modelling using Facebook Prophet
In this project, we are going to talk about Time Series Forecasting to predict the electricity requirement for a particular house using Prophet.

Topic modelling using Kmeans clustering to group customer reviews
In this Kmeans clustering machine learning project, you will perform topic modelling in order to group customer reviews based on recurring patterns.

Demand prediction of driver availability using multistep time series analysis
In this supervised learning machine learning project, you will predict the availability of a driver in a specific area by using multi step time series analysis.

Walmart Sales Forecasting Data Science Project
Data Science Project in R-Predict the sales for each department using historical markdown data from the Walmart dataset containing data of 45 Walmart stores.

Machine Learning or Predictive Models in IoT - Energy Prediction Use Case
In this machine learning and IoT project, we are going to test out the experimental data using various predictive models and train the models and break the energy usage.

Data Science Project-TalkingData AdTracking Fraud Detection
Machine Learning Project in R-Detect fraudulent click traffic for mobile app ads using R data science programming language.

German Credit Dataset Analysis to Classify Loan Applications
In this data science project, you will work with German credit dataset using classification techniques like Decision Tree, Neural Networks etc to classify loan applications using R.

PySpark Tutorial - Learn to use Apache Spark with Python
PySpark Project-Get a handle on using Python with Spark through this hands-on data processing spark python tutorial.