How to apply functions in a Group in a Pandas DataFrame?

This recipe helps you apply functions in a Group in a Pandas DataFrame
Last Updated: 02 Jun 2022

Recipe Objective

Have you ever needed to apply a function to each group in a dataset? One of the easiest ways is to use the apply function.

So this is the recipe on how we can apply functions in a Group in a Pandas DataFrame.

Step 1 - Import the library

import pandas as pd

We have imported pandas, which we need to create the DataFrame and group its rows.

Step 2 - Setting up the Data

We have created a DataFrame by passing a dictionary of column values to pd.DataFrame. Each row holds an employee group label and a points value.

data = {"EmployeeGroup": ["A","A","A","A","A","A","B","B","B","B","B","C","C","C","C","C"],
        "Points": [10,40,50,70,50,50,60,10,40,50,60,70,40,60,40,60]}
df = pd.DataFrame(data)
print(" The Original DataFrame")
print(df)

Step 3 - Applying functions to each group

We have used the apply function to find the rolling mean, average, sum, maximum and minimum of Points within each EmployeeGroup. For this we pass a lambda function that receives the Points values of one group at a time.

print(" Rolling Mean:")
print(df.groupby("EmployeeGroup")["Points"].apply(lambda x: x.rolling(center=False, window=2).mean()))
print(" Average:")
print(df.groupby("EmployeeGroup")["Points"].apply(lambda x: x.mean()))
print(" Sum:")
print(df.groupby("EmployeeGroup")["Points"].apply(lambda x: x.sum()))
print(" Maximum:")
print(df.groupby("EmployeeGroup")["Points"].apply(lambda x: x.max()))
print(" Minimum:")
print(df.groupby("EmployeeGroup")["Points"].apply(lambda x: x.min()))

So the output comes as:

The Original DataFrame
   EmployeeGroup  Points
0              A      10
1              A      40
2              A      50
3              A      70
4              A      50
5              A      50
6              B      60
7              B      10
8              B      40
9              B      50
10             B      60
11             C      70
12             C      40
13             C      60
14             C      40
15             C      60

Rolling Mean:
0      NaN
1     25.0
2     45.0
3     60.0
4     60.0
5     50.0
6      NaN
7     35.0
8     25.0
9     45.0
10    55.0
11     NaN
12    55.0
13    50.0
14    50.0
15    50.0
Name: Points, dtype: float64
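The rolling-mean output above keeps the original row index (0 to 15). Depending on the pandas version, apply may instead add the group key as an extra index level for a result like this. As a minimal sketch, assuming the same data as in Step 2, transform is one way to get a result aligned with the original rows; the variable name rolling_mean is only for illustration.

```python
import pandas as pd

# Same data as in Step 2 of the recipe.
data = {"EmployeeGroup": ["A","A","A","A","A","A","B","B","B","B","B","C","C","C","C","C"],
        "Points": [10,40,50,70,50,50,60,10,40,50,60,70,40,60,40,60]}
df = pd.DataFrame(data)

# transform returns a Series with the same index as df, so the
# per-group rolling mean lines up row by row with the original data.
rolling_mean = df.groupby("EmployeeGroup")["Points"].transform(
    lambda x: x.rolling(window=2).mean()
)
print(rolling_mean)
```

Because the result shares the index of df, it could be stored as a new column if needed, for example under a hypothetical name such as df["RollingMean"].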

Average:
EmployeeGroup
A    45.0
B    44.0
C    54.0
Name: Points, dtype: float64

Sum:
EmployeeGroup
A    270
B    220
C    270
Name: Points, dtype: int64

Maximum:
EmployeeGroup
A    70
B    60
C    70
Name: Points, dtype: int64

Minimum:
EmployeeGroup
A    10
B    10
C    40
Name: Points, dtype: int64
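The four aggregations above (average, sum, maximum, minimum) each call apply with a lambda. As an alternative sketch, not part of the original recipe, the same numbers can be computed in a single call with agg and pandas' built-in aggregation names. The variable name summary is only for illustration.

```python
import pandas as pd

# Same data as in Step 2 of the recipe.
data = {"EmployeeGroup": ["A","A","A","A","A","A","B","B","B","B","B","C","C","C","C","C"],
        "Points": [10,40,50,70,50,50,60,10,40,50,60,70,40,60,40,60]}
df = pd.DataFrame(data)

# One row per group, one column per aggregation.
summary = df.groupby("EmployeeGroup")["Points"].agg(["mean", "sum", "max", "min"])
print(summary)
```

This gives one table with a row per EmployeeGroup instead of four separate Series, which is usually easier to read and compare.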
