How to find mean and median of a column in R?

This recipe helps you find the mean and median of a column in R.

Recipe Objective

Exploratory Data Analysis is a crucial step before building any machine learning model on a dataset. It includes computing summary statistics of the data. A few key statistics describe the location and spread of a numeric variable: the IQR, quartiles, quantiles, mean and median. They help us understand the distribution of a column and detect outliers in it.

This recipe focuses on finding the mean and median of a column.

The mean and median both measure the central tendency of the data. The mean is the sum of the values in the column divided by the total number of observations. The median is the middle value of the sorted column: it divides the data into two equal halves, and when the number of observations is even it is the average of the two middle values.
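As a quick illustration of the two definitions, the sketch below computes both by hand and checks them against R's built-in functions. The vector is made up for illustration (its values are hypothetical and not the dataset used in this recipe).

```r
# Hypothetical values, chosen only for illustration
x <- c(15, 16, 17, 19, 137)

# Mean: sum of the values divided by the number of observations
manual_mean <- sum(x) / length(x)

# Median: middle value of the sorted data
# (for an even count, the average of the two middle values)
sorted_x <- sort(x)
n <- length(sorted_x)
if (n %% 2 == 1) {
  manual_median <- sorted_x[(n + 1) / 2]
} else {
  manual_median <- (sorted_x[n / 2] + sorted_x[n / 2 + 1]) / 2
}

print(manual_mean)    # 40.8
print(manual_median)  # 17

# Both agree with the built-in functions
print(isTRUE(all.equal(manual_mean, mean(x))))      # TRUE
print(isTRUE(all.equal(manual_median, median(x))))  # TRUE
```

Note how the single large value (137) pulls the mean well above the median. A large gap between the two is often a hint that the column is skewed or contains outliers.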

Step 1: Importing libraries and loading dataset

# Data manipulation package
library(tidyverse)

# reading a dataset
customer_seg <- read.csv('R_68_Mall_Customers.csv')

glimpse(customer_seg)
Rows: 200
Columns: 5
$ CustomerID              1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 1...
$ Gender                  Male, Male, Female, Female, Female, Female, ...
$ Age                     19, 21, 20, 23, 31, 22, 35, 23, 64, 30, 67, ...
$ Annual.Income..k..      15, 15, 16, 16, 17, 17, 18, 18, 19, 19, 19, ...
$ Spending.Score..1.100.  39, 81, 6, 77, 40, 76, 6, 94, 3, 72, 14, 99,...

Dataset description: This is basic data about customers visiting a supermarket mall. The variable we are interested in is Annual.Income..k.., the annual income in thousands.

Step 2: Calculating mean

We use the mean() function to calculate the mean of the column.

mean(customer_seg$Annual.Income..k..)
60.56

Step 3: Calculating median

We use the median() function to calculate the median of the column.

median(customer_seg$Annual.Income..k..)
61.5
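
One practical caveat: if a column contains missing values, both mean() and median() return NA by default. The sketch below shows the na.rm = TRUE argument on a made-up vector. The values are hypothetical, and nothing in this recipe says whether the mall dataset itself has missing values.

```r
# Hypothetical income values with one missing entry (illustration only)
income <- c(15, 16, NA, 18, 19)

# By default, a missing value propagates to the result
mean_default <- mean(income)
print(mean_default)   # NA

# na.rm = TRUE drops missing values before computing
mean_clean <- mean(income, na.rm = TRUE)
median_clean <- median(income, na.rm = TRUE)
print(mean_clean)     # 17
print(median_clean)   # 17
```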

Step 4: Calculating median and mean together

We use the summary() function to get the mean and median together with other summary statistics of the column (minimum, maximum, and first and third quartiles).

summary(customer_seg$Annual.Income..k..)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  15.00   41.50   61.50   60.56   78.00  137.00
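
To get the mean and median of several numeric columns at once, one option is to apply both functions over the columns with sapply(). This is a minimal sketch, not part of the original recipe. The small data frame here (named sample_df purely for illustration) is built from the first five values of Age and Annual.Income..k.. shown in the glimpse() output above, so the numbers differ from the full-dataset results.

```r
# First five rows of two numeric columns, taken from the glimpse() output
sample_df <- data.frame(
  Age = c(19, 21, 20, 23, 31),
  Annual.Income..k.. = c(15, 15, 16, 16, 17)
)

# Apply mean() and median() to every column; the result is a matrix
# with one row per statistic and one column per variable
stats <- sapply(sample_df, function(col) {
  c(mean = mean(col), median = median(col))
})

print(stats)
```

With the full dataset, the same call could be made on the numeric columns of customer_seg, for example after selecting them by name.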
