This recipe helps you summarise each column using the dplyr package in R.


Recipe Objective

Aggregation is one of the fundamental data manipulation techniques that a data scientist should know. In R, dplyr is a widely used add-on package for data manipulation tasks. For aggregation, dplyr provides the group_by() function. We use the summarise_each() and summarise() functions together with aggregation functions such as mean, min and max to summarise one or more variables of the aggregated data.

These aggregation functions, whether built-in or user-defined, take a vector as input and return a single numeric value.
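As a minimal sketch of "vector in, single value out", the example below passes one built-in function (mean) and one user-defined function to summarise(). The scores data frame and the score_range helper are hypothetical names introduced only for this illustration.

```r
library(dplyr)

# Hypothetical data, used only for this illustration
scores <- data.frame(marks = c(55, 60, 65))

# User-defined aggregation: takes a vector, returns a single number
score_range <- function(x) max(x) - min(x)

# Each expression collapses the marks column to one value
result <- summarise(scores,
                    mean_marks  = mean(marks),
                    range_marks = score_range(marks))
print(result)
```

The result is a one-row data frame with one column per summary expression.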

Specifically, the summarise_each() function is used when we want to summarise more than one variable by applying more than one function to each variable.

Syntax: summarise_each(x, funs(...), ...)

  1. x = the data frame
  2. funs(...) = the function(s) to be applied to the variables specified after it
  3. ... = the variables to be summarised
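Recent dplyr releases mark summarise_each() and funs() as deprecated, so the same three pieces can also be written with summarise() and across(). The sketch below assumes dplyr 1.0.0 or later; the data frame df and its columns a, b and c are hypothetical and exist only to show how the arguments map.

```r
library(dplyr)

# Hypothetical data frame standing in for x
df <- data.frame(a = c(1, 2, 3),
                 b = c(10, 20, 30),
                 c = c("x", "y", "z"))

# Variables to summarise: a and b
# Functions to apply: min and max (the list names become column suffixes)
out <- summarise(df, across(c(a, b), list(min = min, max = max)))
print(names(out))
```

With across(), the result columns are grouped by variable (a_min, a_max, b_min, b_max), whereas summarise_each() groups them by function.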

In this recipe, we will learn how to summarise each column using the dplyr package in R.

Step 1: Loading the required library and Creating a DataFrame

Create a STUDENT data frame holding each student's name and their marks in two subjects across three trimester exams.

# data manipulation
library(dplyr)
library(tidyverse)

STUDENT = data.frame(Name = c("Ram", "Ram", "Ram", "Shyam", "Shyam", "Shyam", "Jessica", "Jessica", "Jessica"),
                     Science_Marks = c(55, 60, 65, 80, 70, 75, 45, 65, 70),
                     Math_Marks = c(70, 75, 73, 50, 53, 55, 65, 78, 75),
                     Trimester = c(1, 2, 3, 1, 2, 3, 1, 2, 3))

glimpse(STUDENT)
Rows: 9
Columns: 4
$ Name           Ram, Ram, Ram, Shyam, Shyam, Shyam, Jessica, Jessica,...
$ Science_Marks  55, 60, 65, 80, 70, 75, 45, 65, 70
$ Math_Marks     70, 75, 73, 50, 53, 55, 65, 78, 75
$ Trimester      1, 2, 3, 1, 2, 3, 1, 2, 3

Step 2: Application of summarise_each Function

Query: Find the minimum and maximum marks in Science and Math across all three trimesters.

summarise_each(STUDENT, funs(min,max), Science_Marks, Math_Marks)
Science_Marks_min  Math_Marks_min  Science_Marks_max  Math_Marks_max
               45              50                 80              78
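The query above summarises the whole table at once. To get the same minimum and maximum per student, the summary can be combined with group_by(), as mentioned in the objective. The following is a sketch rather than part of the original recipe: it assumes dplyr 1.0.0 or later and uses summarise() with across(), which current dplyr recommends in place of the deprecated summarise_each() and funs(). The per_student name is introduced here for illustration.

```r
library(dplyr)

# Same data as in Step 1, rebuilt so that this example runs on its own
STUDENT <- data.frame(
  Name = rep(c("Ram", "Shyam", "Jessica"), each = 3),
  Science_Marks = c(55, 60, 65, 80, 70, 75, 45, 65, 70),
  Math_Marks = c(70, 75, 73, 50, 53, 55, 65, 78, 75),
  Trimester = rep(c(1, 2, 3), times = 3)
)

# One row per student, with min and max of each subject over the trimesters
per_student <- STUDENT %>%
  group_by(Name) %>%
  summarise(across(c(Science_Marks, Math_Marks), list(min = min, max = max)))

print(per_student)
```

The result has one row per Name, with columns Science_Marks_min, Science_Marks_max, Math_Marks_min and Math_Marks_max.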

