How to summarise each column using the dplyr package in R?

This recipe helps you summarise each column using the dplyr package in R.

Recipe Objective

Aggregation is one of the fundamental data manipulation techniques that a data scientist should know. In R, dplyr is a widely used add-on package for data manipulation tasks. For aggregation, dplyr provides the group_by() function to split a data frame into groups. We then use summarise() or summarise_each() together with aggregation functions such as mean, min and max to summarise one or more variables of the grouped (or ungrouped) data.

The aggregation functions, whether built-in or user-defined, take a vector as input and return a single value (a single number in this recipe).
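As a minimal sketch of the group_by() plus summarise() pattern described above, the example below collapses each group to one row. The scores data frame and its team and points columns are hypothetical and are not part of this recipe's data.

```r
library(dplyr)

# Hypothetical toy data, used only to illustrate the pattern
scores <- data.frame(
  team   = c("A", "A", "B", "B"),
  points = c(10, 20, 30, 50)
)

# Each aggregation function receives the points vector of one group
# and returns a single value, so the result has one row per team
team_summary <- scores %>%
  group_by(team) %>%
  summarise(mean_points = mean(points), max_points = max(points))

print(team_summary)
```

The result has two rows, one for team A and one for team B.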

Specifically, summarise_each() is used when we want to summarise more than one variable by applying more than one function to each variable.

Syntax: summarise_each(x, funs(...), ...)

Where:

  1. x = the data frame
  2. funs(...) = the function or functions to apply to the selected variables
  3. ... = the variables to summarise
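In current dplyr releases, summarise_each() and funs() are deprecated, and across() inside summarise() is the usual replacement (this assumes dplyr 1.0.0 or later). A minimal sketch of the same pattern follows; df, a and b are illustrative names, not part of this recipe.

```r
library(dplyr)

# Hypothetical two-column data frame
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))

# across() takes the columns first, then a named list of functions;
# output columns are named <column>_<function> by default
result <- df %>%
  summarise(across(c(a, b), list(min = min, max = max)))

print(result)
```

Note that across() orders the output by column (a_min, a_max, b_min, b_max), whereas summarise_each() orders it by function.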

In this recipe, we will learn how to summarise each column using the dplyr package in R.

Step 1: Loading the required library and Creating a DataFrame

We create a STUDENT data frame containing each student's Name and their marks in two subjects across three trimester exams.

# data manipulation
library(dplyr)
library(tidyverse)

# marks of three students in two subjects over three trimesters
STUDENT = data.frame(
  Name = c("Ram", "Ram", "Ram", "Shyam", "Shyam", "Shyam", "Jessica", "Jessica", "Jessica"),
  Science_Marks = c(55, 60, 65, 80, 70, 75, 45, 65, 70),
  Math_Marks = c(70, 75, 73, 50, 53, 55, 65, 78, 75),
  Trimester = c(1, 2, 3, 1, 2, 3, 1, 2, 3)
)

# inspect the structure of the data frame
glimpse(STUDENT)
Rows: 9
Columns: 4
$ Name           Ram, Ram, Ram, Shyam, Shyam, Shyam, Jessica, Jessica,...
$ Science_Marks  55, 60, 65, 80, 70, 75, 45, 65, 70
$ Math_Marks     70, 75, 73, 50, 53, 55, 65, 78, 75
$ Trimester      1, 2, 3, 1, 2, 3, 1, 2, 3

Step 2: Application of summarise_each Function

Query: Find the minimum and maximum marks in Science and Math across all three trimesters.

summarise_each(STUDENT, funs(min,max), Science_Marks, Math_Marks)
Science_Marks_min  Math_Marks_min  Science_Marks_max  Math_Marks_max
               45              50                 80              78
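The query above summarises the whole table. As a sketch of how group_by() combines with it, the following computes the same minimum and maximum per student instead. It assumes dplyr 1.0.0 or later for across(); by_student is a name introduced here for illustration.

```r
library(dplyr)

# Same data as the STUDENT data frame from Step 1
STUDENT <- data.frame(
  Name = c("Ram", "Ram", "Ram", "Shyam", "Shyam", "Shyam",
           "Jessica", "Jessica", "Jessica"),
  Science_Marks = c(55, 60, 65, 80, 70, 75, 45, 65, 70),
  Math_Marks = c(70, 75, 73, 50, 53, 55, 65, 78, 75),
  Trimester = c(1, 2, 3, 1, 2, 3, 1, 2, 3)
)

# Group by student, then apply min and max to both marks columns
by_student <- STUDENT %>%
  group_by(Name) %>%
  summarise(across(c(Science_Marks, Math_Marks), list(min = min, max = max)))

print(by_student)
```

The result has one row per student, with columns Science_Marks_min, Science_Marks_max, Math_Marks_min and Math_Marks_max.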
