How to apply functions in a Group in a Pandas DataFrame?

This recipe helps you apply functions in a Group in a Pandas DataFrame

Recipe Objective

Have you ever needed to apply a function to a dataset? One of the easiest ways is to use the apply function.

This recipe shows how to apply functions to each group in a pandas DataFrame.

Step 1 - Import the library

import pandas as pd

We import pandas, which we need to build and group the DataFrame.

Step 2 - Setting up the Data

We create a DataFrame from a dictionary that holds an employee group label and a points value for each row.

data = {"EmployeeGroup": ["A","A","A","A","A","A","B","B","B","B","B","C","C","C","C","C"],
        "Points": [10,40,50,70,50,50,60,10,40,50,60,70,40,60,40,60]}
df = pd.DataFrame(data)
print(" The Original DataFrame")
print(df)

Step 3 - Applying functions to each group

We use the apply function to find the rolling mean, average, sum, maximum and minimum of Points within each EmployeeGroup. In each case we group the DataFrame by EmployeeGroup, select the Points column, and pass a lambda function that receives the Points values of one group at a time.

print(" Rolling Mean:")
print(df.groupby("EmployeeGroup")["Points"].apply(lambda x: x.rolling(center=False, window=2).mean()))

print(" Average:")
print(df.groupby("EmployeeGroup")["Points"].apply(lambda x: x.mean()))

print(" Sum:")
print(df.groupby("EmployeeGroup")["Points"].apply(lambda x: x.sum()))

print(" Maximum:")
print(df.groupby("EmployeeGroup")["Points"].apply(lambda x: x.max()))

print(" Minimum:")
print(df.groupby("EmployeeGroup")["Points"].apply(lambda x: x.min()))

So the output comes as:

The Original DataFrame
   EmployeeGroup  Points
0              A      10
1              A      40
2              A      50
3              A      70
4              A      50
5              A      50
6              B      60
7              B      10
8              B      40
9              B      50
10             B      60
11             C      70
12             C      40
13             C      60
14             C      40
15             C      60

Rolling Mean:
0      NaN
1     25.0
2     45.0
3     60.0
4     60.0
5     50.0
6      NaN
7     35.0
8     25.0
9     45.0
10    55.0
11     NaN
12    55.0
13    50.0
14    50.0
15    50.0
Name: Points, dtype: float64

Average:
EmployeeGroup
A    45.0
B    44.0
C    54.0
Name: Points, dtype: float64

Sum:
EmployeeGroup
A    270
B    220
C    270
Name: Points, dtype: int64

Maximum:
EmployeeGroup
A    70
B    60
C    70
Name: Points, dtype: int64

Minimum:
EmployeeGroup
A    10
B    10
C    40
Name: Points, dtype: int64
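
The same results can also be obtained without apply. The sketch below is not part of the original recipe: it uses agg to compute several per-group statistics in one call, and transform to compute the rolling mean so that the result stays aligned with the original row index. Depending on the pandas version, the apply-based rolling mean may carry the group key as an extra index level, which transform avoids. The variable names grouped, summary and rolling_mean are chosen here for illustration.

```python
import pandas as pd

# Same data as in Step 2.
data = {
    "EmployeeGroup": ["A"] * 6 + ["B"] * 5 + ["C"] * 5,
    "Points": [10, 40, 50, 70, 50, 50, 60, 10, 40, 50, 60, 70, 40, 60, 40, 60],
}
df = pd.DataFrame(data)

# Group once and reuse the grouped Points column.
grouped = df.groupby("EmployeeGroup")["Points"]

# agg: one row per group, one column per statistic.
summary = grouped.agg(["mean", "sum", "max", "min"])
print(summary)

# transform: one value per original row, indexed like df.
# The first row of every group is NaN because the window of 2 is not yet full.
rolling_mean = grouped.transform(lambda s: s.rolling(window=2).mean())
print(rolling_mean)
```

Because rolling_mean shares the index of df, it can be assigned back as a new column, for example df["RollingMean"] = rolling_mean.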