# How to find mean and median of a column in R?

This recipe helps you find the mean and median of a column in R.

Exploratory data analysis (EDA) is a crucial step before building any machine learning model on a dataset. It includes computing summary statistics from the data. A few common statistics describe the center and spread of a numeric variable: the mean, median, quartiles, quantiles and interquartile range (IQR). They help us understand the distribution of a column and detect any outliers in it.
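As a rough illustration of how quartiles and the IQR can flag outliers, the sketch below applies the common 1.5 * IQR rule in base R. The vector `x` is made up for this example and is not taken from the dataset used later in the recipe. The 1.5 multiplier is a convention, not something the recipe prescribes.

```r
# Hypothetical sample values, chosen only for illustration
x <- c(15, 16, 17, 18, 19, 20, 21, 22, 23, 137)

# First and third quartiles (R's default quantile method)
q <- quantile(x, probs = c(0.25, 0.75))

# Interquartile range: third quartile minus first quartile
iqr <- IQR(x)

# Conventional 1.5 * IQR fences (an assumed rule of thumb)
lower <- q[[1]] - 1.5 * iqr
upper <- q[[2]] + 1.5 * iqr

# Values falling outside the fences are treated as potential outliers
outliers <- x[x < lower | x > upper]
print(outliers)  # 137
```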

This recipe focuses on finding the mean and median of a column.

The mean and median describe the central tendency of the data. The mean is the sum of the values in the column divided by the total number of observations. The median is the middle value of the sorted column: it splits the data into two equal halves. When the number of observations is even, the median is the average of the two middle values.
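To make the two definitions concrete, here is a minimal sketch that computes both by hand and compares them with R's built-in functions. The vector `incomes` is hypothetical. It includes one large value to show that the mean is pulled toward extreme values while the median is not.

```r
# Hypothetical incomes (in thousands); not from the mall dataset
incomes <- c(15, 16, 17, 18, 120)

# Mean: sum of the values divided by the number of observations
manual_mean <- sum(incomes) / length(incomes)

# Median: middle element of the sorted values (odd-length vector here)
sorted <- sort(incomes)
manual_median <- sorted[(length(sorted) + 1) / 2]

print(manual_mean)    # 37.2
print(manual_median)  # 17

# The built-in functions give the same results
print(mean(incomes))    # 37.2
print(median(incomes))  # 17
```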

```
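# Load the data manipulation packages (provides glimpse())
library(tidyverse)

# Read the dataset into a data frame
customer_seg <- read.csv('R_68_Mall_Customers.csv')

# Inspect the column names, types and first few values
glimpse(customer_seg)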
```

    Rows: 200
    Columns: 5
    $ CustomerID             1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 1...
    $ Gender                 Male, Male, Female, Female, Female, Female, ...
    $ Age                    19, 21, 20, 23, 31, 22, 35, 23, 64, 30, 67, ...
    $ Annual.Income..k..     15, 15, 16, 16, 17, 17, 18, 18, 19, 19, 19, ...
    $ Spending.Score..1.100. 39, 81, 6, 77, 40, 76, 6, 94, 3, 72, 14, 99,...

Dataset description: This is basic data about the customers visiting a supermarket mall. The variable we are interested in is Annual.Income..k.., the annual income in thousands.

We use the mean() function to calculate the mean of the column.

```
mean(customer_seg$Annual.Income..k..)
```

60.56

We use the median() function to calculate the median of the column.

```
median(customer_seg$Annual.Income..k..)
```

61.5
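If a column contains missing values, both mean() and median() return NA by default. The sketch below uses a hypothetical vector with one NA (the recipe does not say whether the mall dataset has missing values) and the `na.rm = TRUE` argument, which drops missing values before the calculation.

```r
# Hypothetical income values with one missing entry
income_with_gap <- c(15, 16, NA, 18, 19)

# With the default settings, a single NA makes the result NA
print(mean(income_with_gap))    # NA
print(median(income_with_gap))  # NA

# na.rm = TRUE removes missing values before computing the statistic
clean_mean <- mean(income_with_gap, na.rm = TRUE)
clean_median <- median(income_with_gap, na.rm = TRUE)

print(clean_mean)    # 17
print(clean_median)  # 17
```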

We use the summary() function to get the mean and median together with the minimum, maximum and quartiles of the column.

```
summary(customer_seg$Annual.Income..k..)
```

       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      15.00   41.50   61.50   60.56   78.00  137.00
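The same functions can be applied to several columns at once. The following is a minimal base R sketch, assuming a small stand-in data frame named `sample_df` built from the first four rows visible in the glimpse() output above. With the full dataset, `customer_seg` could be used in its place.

```r
# Stand-in data frame: first four rows shown in the glimpse() output
sample_df <- data.frame(
  Gender = c("Male", "Male", "Female", "Female"),
  Age = c(19, 21, 20, 23),
  Annual.Income..k.. = c(15, 15, 16, 16)
)

# Keep only numeric columns, since mean() and median() need numbers
numeric_cols <- sapply(sample_df, is.numeric)

# Apply mean() and median() to each numeric column
col_means <- sapply(sample_df[numeric_cols], mean)
col_medians <- sapply(sample_df[numeric_cols], median)

print(col_means)    # Age 20.75, Annual.Income..k.. 15.50
print(col_medians)  # Age 20.5,  Annual.Income..k.. 15.5
```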
