How to aggregate using group by in pandas over multiple columns?
MACHINE LEARNING RECIPES DATA CLEANING PYTHON DATA MUNGING PANDAS CHEATSHEET     ALL TAGS

How to aggregate using group by in pandas over multiple columns?

How to aggregate using group by in pandas over multiple columns?

This recipe helps you aggregate using group by in pandas over multiple columns

Recipe Objective

Suppose we have a code in form of string and we want to run it replacing values of variables. It can be achieved via eval function.

So this recipe is a short example on how to aggregate using group by in pandas over multiple columns. Let's get started.

Step 1 - Import the library

import pandas as pd import seaborn as sb

Let's pause and look at these imports. Pandas is generally used for performing mathematical operation and preferably over arrays. Seaborn is just used in here to import dataset.

Step 2 - Setup the Data

df = sb.load_dataset('tips') print(df.head())

Here we have imported tips dataset from seaborn library.

Step 3 - Aggregate using groupby

df=df.groupby(['sex','smoker','day','time','size']).sum() print(df)

Here we are groupby on certain columns and finally taking the sum of each identity of columns.

Step 4 - Let's look at our dataset now

Once we run the above code snippet, we will see:

Scroll down to the ipython file to look at the results.

We can see the data being aggregated on specified columns.

Relevant Projects

Topic modelling using Kmeans clustering to group customer reviews
In this Kmeans clustering machine learning project, you will perform topic modelling in order to group customer reviews based on recurring patterns.

Avocado Machine Learning Project Python for Price Prediction
In this ML Project, you will use the Avocado dataset to build a machine learning model to predict the average price of avocado which is continuous in nature based on region and varieties of avocado.

Loan Eligibility Prediction in Python using H2O.ai
In this loan prediction project you will build predictive models in Python using H2O.ai to predict if an applicant is able to repay the loan or not.

Churn Prediction in Telecom using Machine Learning in R
Estimating churners before they discontinue using a product or service is extremely important. In this ML project, you will develop a churn prediction model in telecom to predict customers who are most likely subject to churn.

Build an Image Classifier for Plant Species Identification
In this machine learning project, we will use binary leaf images and extracted features, including shape, margin, and texture to accurately identify plant species using different benchmark classification techniques.

Build a Collaborative Filtering Recommender System in Python
Use the Amazon Reviews/Ratings dataset of 2 Million records to build a recommender system using memory-based collaborative filtering in Python.

Data Science Project in Python on BigMart Sales Prediction
The goal of this data science project is to build a predictive model and find out the sales of each product at a given Big Mart store.

Image Segmentation using Mask R-CNN with Tensorflow
In this Deep Learning Project on Image Segmentation Python, you will learn how to implement the Mask R-CNN model for early fire detection.

Build a Similar Images Finder with Python, Keras, and Tensorflow
Build your own image similarity application using Python to search and find images of products that are similar to any given product. You will implement the K-Nearest Neighbor algorithm to find products with maximum similarity.

Inventory Demand Forecasting using Machine Learning in R
In this machine learning project, you will develop a machine learning model to accurately forecast inventory demand based on historical sales data.