How to summarise each column using the dplyr package in R?

This recipe helps you summarise each column using the dplyr package in R.

Recipe Objective

Aggregation is one of the fundamental techniques in data manipulation that a data scientist should know. In R, dplyr is a widely used add-on package for data manipulation tasks. For aggregation, dplyr provides the group_by() function, which splits the data into groups. We then use the summarise_each() and summarise() functions together with aggregation functions such as mean, min and max to summarise one or more variables of the grouped data.

The aggregation functions take a vector as input and return a single numeric value. They can be in-built functions (such as mean, min and max) or user-defined functions.
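As a minimal sketch of this idea, the example below passes one in-built and one user-defined function to summarise(). The data frame marks and the helper score_range() are hypothetical names introduced only for this illustration.

```r
library(dplyr)

# Hypothetical toy data, used only for this illustration
marks <- data.frame(score = c(55, 60, 65, 80))

# User-defined summary (hypothetical helper): gap between highest and lowest value
score_range <- function(x) max(x) - min(x)

# Each function receives the whole column as a vector and returns one value
result <- summarise(marks,
                    mean_score  = mean(score),         # in-built function
                    range_score = score_range(score))  # user-defined function
print(result)
```

Each summary column in the result holds a single value, because every function collapses the input vector to one number.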

Specifically, the summarise_each() function is used when we want to summarise more than one variable by applying more than one function to each variable.

Syntax: summarise_each(x, funs(...) , ...)

Where:

  1. x = the dataframe to summarise
  2. funs(...) = the functions to be applied to the variables specified after it
  3. ... = the variables to be summarised
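Note that summarise_each() and funs() are deprecated in current dplyr releases, so the syntax above may raise deprecation warnings. A rough equivalent, assuming dplyr 1.0.0 or later, is summarise() combined with across(). The sketch below uses a hypothetical toy data frame df with columns a and b to show how the three arguments map over.

```r
library(dplyr)

# Hypothetical toy data, used only for this illustration
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))

# x         -> first argument of summarise()
# ...       -> column selection, the first argument of across()
# funs(...) -> a named list of functions, the second argument of across()
result <- summarise(df, across(c(a, b), list(min = min, max = max)))
print(result)
```

The result columns are named by joining the variable name and the function name (a_min, a_max, b_min, b_max), ordered column by column rather than function by function.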

In this recipe, we will learn how to summarise each column using the dplyr package in R.

Step 1: Loading the required library and Creating a DataFrame

We create a STUDENT dataframe containing each student's Name and their marks in two subjects across three Trimester exams.

# data manipulation
library(dplyr)
library(tidyverse)

STUDENT = data.frame(Name = c("Ram", "Ram", "Ram", "Shyam", "Shyam", "Shyam", "Jessica", "Jessica", "Jessica"),
                     Science_Marks = c(55, 60, 65, 80, 70, 75, 45, 65, 70),
                     Math_Marks = c(70, 75, 73, 50, 53, 55, 65, 78, 75),
                     Trimester = c(1, 2, 3, 1, 2, 3, 1, 2, 3))

glimpse(STUDENT)
Rows: 9
Columns: 4
$ Name           Ram, Ram, Ram, Shyam, Shyam, Shyam, Jessica, Jessica,...
$ Science_Marks  55, 60, 65, 80, 70, 75, 45, 65, 70
$ Math_Marks     70, 75, 73, 50, 53, 55, 65, 78, 75
$ Trimester      1, 2, 3, 1, 2, 3, 1, 2, 3

Step 2: Application of summarise_each Function

Query: Find the minimum and maximum marks in the Science and Math subjects across Trimesters 1, 2 and 3.

summarise_each(STUDENT, funs(min,max), Science_Marks, Math_Marks)
Science_Marks_min  Math_Marks_min  Science_Marks_max  Math_Marks_max
45                 50              80                 78
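The query above summarises all nine rows at once. To apply the same summary to aggregated data, as described in the Recipe Objective, the data can first be grouped with group_by(). The sketch below is one possible way to do this, assuming dplyr 1.0.0 or later; it uses summarise() with across() rather than summarise_each(), and the name per_student is introduced only for this illustration.

```r
library(dplyr)

# Same values as the STUDENT dataframe created in Step 1
STUDENT <- data.frame(
  Name = rep(c("Ram", "Shyam", "Jessica"), each = 3),
  Science_Marks = c(55, 60, 65, 80, 70, 75, 45, 65, 70),
  Math_Marks = c(70, 75, 73, 50, 53, 55, 65, 78, 75),
  Trimester = rep(1:3, times = 3)
)

# One output row per student, with min and max for each subject
per_student <- STUDENT %>%
  group_by(Name) %>%
  summarise(across(c(Science_Marks, Math_Marks), list(min = min, max = max)))
print(per_student)
```

The result has one row for each of the three students instead of a single row for the whole dataframe.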
