How to apply functions in a Group in a Pandas DataFrame?

How to apply functions in a Group in a Pandas DataFrame?

How to apply functions in a Group in a Pandas DataFrame?

This recipe helps you apply functions in a Group in a Pandas DataFrame

Recipe Objective

Have you tried to apply a function on any dataset. One of the easiest way is to use apply function.

So this is the recipe on how we can apply functions in a Group in a Pandas DataFrame.

Step 1 - Import the library

import pandas as pd

We have imported pandas which will be needed for the dataset.

Step 2 - Setting up the Data

We have made a dataframe by using a dictionary. We have passed a dictionary with different values to create a dataframe. data = {"EmployeeGroup": ["A","A","A","A","A","A","B","B","B","B","B","C","C","C","C","C"], "Points": [10,40,50,70,50,50,60,10,40,50,60,70,40,60,40,60]} df = pd.DataFrame(data) print(" The Original DataFrame") print(df)

Step 3 - Training and Saving the model

We have used apply function to find Rolling Mean, Average, Sum, Maximum and Minimum. For this we have used lambda function on each and every values of the feature. print(" Rolling Mean:"); print(df.groupby("EmployeeGroup")["Points"].apply(lambda x:x.rolling(center=False,window=2).mean())) print(" Average:"); print(df.groupby("EmployeeGroup")["Points"].apply(lambda x:x.mean())) print(" Sum:"); print(df.groupby("EmployeeGroup")["Points"].apply(lambda x:x.sum())) print(" Maximum:"); print(df.groupby("EmployeeGroup")["Points"].apply(lambda x:x.max())) print(" Minimum:"); print(df.groupby("EmployeeGroup")["Points"].apply(lambda x:x.min())) So the output comes as:

The Original DataFrame
   EmployeeGroup  Points
0              A      10
1              A      40
2              A      50
3              A      70
4              A      50
5              A      50
6              B      60
7              B      10
8              B      40
9              B      50
10             B      60
11             C      70
12             C      40
13             C      60
14             C      40
15             C      60

Rolling Mean:
0      NaN
1     25.0
2     45.0
3     60.0
4     60.0
5     50.0
6      NaN
7     35.0
8     25.0
9     45.0
10    55.0
11     NaN
12    55.0
13    50.0
14    50.0
15    50.0
Name: Points, dtype: float64

A    45.0
B    44.0
C    54.0
Name: Points, dtype: float64

A    270
B    220
C    270
Name: Points, dtype: int64

A    70
B    60
C    70
Name: Points, dtype: int64

A    10
B    10
C    40
Name: Points, dtype: int64

Download Materials

Relevant Projects

Digit Recognition using CNN for MNIST Dataset in Python
In this deep learning project, you will build a convolutional neural network using MNIST dataset for handwritten digit recognition.

Ola Bike Rides Request Demand Forecast
Given big data at taxi service (ride-hailing) i.e. OLA, you will learn multi-step time series forecasting and clustering with Mini-Batch K-means Algorithm on geospatial data to predict future ride requests for a particular region at a given time.

Credit Card Fraud Detection as a Classification Problem
In this data science project, we will predict the credit card fraud in the transactional dataset using some of the predictive models.

Build OCR from Scratch Python using YOLO and Tesseract
In this deep learning project, you will learn how to build your custom OCR (optical character recognition) from scratch by using Google Tesseract and YOLO to read the text from any images.

Word2Vec and FastText Word Embedding with Gensim in Python
In this NLP Project, you will learn how to use the popular topic modelling library Gensim for implementing two state-of-the-art word embedding methods Word2Vec and FastText models.

Time Series Analysis Project in R on Stock Market forecasting
In this time series project, you will build a model to predict the stock prices and identify the best time series forecasting model that gives reliable and authentic results for decision making.

Census Income Data Set Project - Predict Adult Census Income
Use the Adult Income dataset to predict whether income exceeds 50K yr based on census data.

Data Science Project in Python on BigMart Sales Prediction
The goal of this data science project is to build a predictive model and find out the sales of each product at a given Big Mart store.

Predict Credit Default | Give Me Some Credit Kaggle
In this data science project, you will predict borrowers chance of defaulting on credit loans by building a credit score prediction model.

Predict Macro Economic Trends using Kaggle Financial Dataset
In this machine learning project, you will uncover the predictive value in an uncertain world by using various artificial intelligence, machine learning, advanced regression and feature transformation techniques.