How to find IQR of a column in R?

How to find IQR of a column in R?

How to find IQR of a column in R?

This recipe helps you find IQR of a column in R


Recipe Objective

Exploratory Data Analysis is a crucial step before building any machine learning model on a dataset. This also includes gathering statistical inferences from the data. There are a few main terms in statistics which describes the variability of the numeric variable. These include IQR, quartiles, quantiles, mean and median. They help us to detect any outliers in the column and the distribution of the column.

This recipe focuses on finding IQR (Inter-Quartile Range) of a column.

IQR is the measure of spread in the mid 50% (Half) of the data. It is the difference between the first and third quartile.

Step 1: Loading required library and loading dataset

# Data manipulation package library(tidyverse) ​ # reading a dataset customer_seg = read.csv('R_67_Mall_Customers.csv') ​ glimpse(customer_seg)
Rows: 200
Columns: 5
$ CustomerID              1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 1...
$ Gender                  Male, Male, Female, Female, Female, Female, ...
$ Age                     19, 21, 20, 23, 31, 22, 35, 23, 64, 30, 67, ...
$ Annual.Income..k..      15, 15, 16, 16, 17, 17, 18, 18, 19, 19, 19, ...
$ Spending.Score..1.100.  39, 81, 6, 77, 40, 76, 6, 94, 3, 72, 14, 99,...

Dataset description: It is the basic data about the customers going to the supermarket mall. The variable that we interested in finding the IQR is Annual.Income which is in 1000s.

Step 2: Calculating IQR

We use the IQR() function to calculate the Interquartile range


Note: this means that 36.5 is the mid 50% range of the dataset

Relevant Projects

Choosing the right Time Series Forecasting Methods
There are different time series forecasting methods to forecast stock price, demand etc. In this machine learning project, you will learn to determine which forecasting method to be used when and how to apply with time series forecasting example.

Predict Credit Default | Give Me Some Credit Kaggle
In this data science project, you will predict borrowers chance of defaulting on credit loans by building a credit score prediction model.

Machine Learning project for Retail Price Optimization
In this machine learning pricing project, we implement a retail price optimization algorithm using regression trees. This is one of the first steps to building a dynamic pricing model.

Data Science Project-TalkingData AdTracking Fraud Detection
Machine Learning Project in R-Detect fraudulent click traffic for mobile app ads using R data science programming language.

Build a Collaborative Filtering Recommender System in Python
Use the Amazon Reviews/Ratings dataset of 2 Million records to build a recommender system using memory-based collaborative filtering in Python.

Topic modelling using Kmeans clustering to group customer reviews
In this Kmeans clustering machine learning project, you will perform topic modelling in order to group customer reviews based on recurring patterns.

Predict Churn for a Telecom company using Logistic Regression
Machine Learning Project in R- Predict the customer churn of telecom sector and find out the key drivers that lead to churn. Learn how the logistic regression model using R can be used to identify the customer churn in telecom dataset.

Expedia Hotel Recommendations Data Science Project
In this data science project, you will contextualize customer data and predict the likelihood a customer will stay at 100 different hotel groups.

Data Science Project - Instacart Market Basket Analysis
Data Science Project - Build a recommendation engine which will predict the products to be purchased by an Instacart consumer again.

Forecast Inventory demand using historical sales data in R
In this machine learning project, you will develop a machine learning model to accurately forecast inventory demand based on historical sales data.