How to summarise each column using dplyr package in R?

This recipe helps you summarise each column using the dplyr package in R.
Last Updated: 06 May 2021



Recipe Objective

Aggregation is one of the fundamental data manipulation techniques that a data scientist should know. In R, dplyr is an add-on package that is widely used for data manipulation tasks. For aggregation, dplyr provides the group_by() function. We use summarise_each() and summarise() together with aggregation functions such as mean, min and max to summarise one or more variables of the (optionally grouped) data.

The aggregation functions take a vector as input and return a single value (numeric in this recipe). They can be built-in functions or user-defined ones.
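As a minimal sketch of the idea above, the snippet below first collapses a vector to one value with mean(), then uses group_by() with summarise() to do the same once per group. The `scores` data frame and its column names are hypothetical and are not part of this recipe's dataset.

```r
library(dplyr)

# Hypothetical toy data, used only for this illustration
scores <- data.frame(
  group = c("a", "a", "b", "b"),
  value = c(10, 20, 30, 50)
)

# An aggregation function collapses a whole vector to a single value
overall_mean <- mean(scores$value)

# group_by() + summarise() applies the aggregation once per group,
# giving one output row for each distinct value of `group`
by_group <- scores %>%
  group_by(group) %>%
  summarise(mean_value = mean(value), max_value = max(value))

print(overall_mean)
print(by_group)
```

Without group_by(), summarise() would return a single row for the whole data frame.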

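Specifically, summarise_each() is used when we want to summarise more than one variable by applying one or more functions to each of them. Note that summarise_each() and funs() are deprecated in recent dplyr releases; newer code typically uses summarise() with across() instead.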

Syntax: summarise_each(x, funs(...), ...)

Where:

x = dataframe (or grouped dataframe) to summarise
funs(...) = functions to apply to each selected variable, e.g. funs(min, max)
... = variables to summarise; if omitted, all non-grouping variables are used
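For readers on dplyr 1.0.0 or later, the same pattern can be sketched with summarise() and across(), which is the non-deprecated route. The data frame `df` and its columns below are hypothetical; the point is only how the arguments map across: the selected variables come first, followed by a named list of functions in place of funs().

```r
library(dplyr)

# Hypothetical toy data, used only for this illustration
df <- data.frame(
  a = c(1, 2, 3),
  b = c(10, 20, 30),
  label = c("x", "y", "z")
)

# across(<variables>, <named list of functions>) plays the role of
# summarise_each(x, funs(...), <variables>)
result <- df %>%
  summarise(across(c(a, b), list(min = min, max = max)))

# Output columns are named <variable>_<function>
print(names(result))
print(result)
```

With across(), the result columns are ordered variable by variable (a_min, a_max, b_min, b_max), whereas the summarise_each() output shown in this recipe is ordered function by function.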

In this recipe, we will learn how to summarise each column using the dplyr package in R.

Step 1: Loading the required library and Creating a DataFrame

We create a STUDENT dataframe holding each student's name and their marks in two subjects across three trimester exams.


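# Load dplyr for data manipulation; it also provides glimpse(),
# so loading the whole tidyverse is not required here
library(dplyr)


# Marks of three students in two subjects over three trimesters
STUDENT <- data.frame(Name = c("Ram", "Ram", "Ram", "Shyam", "Shyam", "Shyam", "Jessica", "Jessica", "Jessica"),
                      Science_Marks = c(55, 60, 65, 80, 70, 75, 45, 65, 70),
                      Math_Marks = c(70, 75, 73, 50, 53, 55, 65, 78, 75),
                      Trimester = c(1, 2, 3, 1, 2, 3, 1, 2, 3))

# Compact overview of the columns and their types
glimpse(STUDENT)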

Rows: 9
Columns: 4
$ Name          <fct> Ram, Ram, Ram, Shyam, Shyam, Shyam, Jessica, Jessica,...
$ Science_Marks <dbl> 55, 60, 65, 80, 70, 75, 45, 65, 70
$ Math_Marks    <dbl> 70, 75, 73, 50, 53, 55, 65, 78, 75
$ Trimester     <dbl> 1, 2, 3, 1, 2, 3, 1, 2, 3

Step 2: Application of summarise_each Function


Query: Find the minimum and maximum marks in Science and Math across all three trimesters.


summarise_each(STUDENT, funs(min,max), Science_Marks, Math_Marks)

  Science_Marks_min Math_Marks_min Science_Marks_max Math_Marks_max
1                45             50                80             78
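The query above summarises the whole dataframe at once. As a sketch of how grouping changes the result, the snippet below computes the same minimum and maximum marks separately for each student. It rebuilds the STUDENT data from Step 1 so that it runs on its own, and it assumes dplyr 1.0.0 or later, using summarise() with across() rather than the deprecated summarise_each() and funs(). The name `per_student` is chosen here for illustration.

```r
library(dplyr)

# Same data as the STUDENT dataframe from Step 1
STUDENT <- data.frame(
  Name = c("Ram", "Ram", "Ram", "Shyam", "Shyam", "Shyam",
           "Jessica", "Jessica", "Jessica"),
  Science_Marks = c(55, 60, 65, 80, 70, 75, 45, 65, 70),
  Math_Marks = c(70, 75, 73, 50, 53, 55, 65, 78, 75),
  Trimester = c(1, 2, 3, 1, 2, 3, 1, 2, 3)
)

# One row per student: min and max of each subject over the three trimesters
per_student <- STUDENT %>%
  group_by(Name) %>%
  summarise(across(c(Science_Marks, Math_Marks), list(min = min, max = max)))

print(per_student)
```

The result has three rows (Jessica, Ram, Shyam) instead of one, with columns Science_Marks_min, Science_Marks_max, Math_Marks_min and Math_Marks_max.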
