This recipe helps you aggregate data in pandas using groupby over multiple columns.


Recipe Objective

This recipe is a short example of how to aggregate data in pandas using groupby over multiple columns. Let's get started.

Step 1 - Import the library

import pandas as pd
import seaborn as sb

Let's pause and look at these imports. Pandas is a library for manipulating and analyzing tabular data through its DataFrame and Series objects. Seaborn is used here only to load a sample dataset.

Step 2 - Setup the Data

df = sb.load_dataset('tips')
print(df.head())

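Here we have loaded the tips dataset from seaborn's example data. Its columns are total_bill, tip, sex, smoker, day, time and size. Note that load_dataset fetches the file from seaborn's online data repository, so an internet connection is needed the first time it runs.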

Step 3 - Aggregate using groupby

df = df.groupby(['sex', 'smoker', 'day', 'time', 'size']).sum()
print(df)

Here we group the rows by five columns (sex, smoker, day, time and size) and then sum the remaining numeric columns, total_bill and tip, within each group.
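To make the grouping step easier to follow without downloading anything, here is a minimal sketch on a small hand-made DataFrame. The frame `sample` and its values are made up for illustration and are not rows from the tips dataset; only two key columns are used to keep the output short.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of tips-like data (values are made up).
sample = pd.DataFrame({
    "sex": ["Male", "Male", "Female", "Female", "Male"],
    "smoker": ["No", "No", "Yes", "Yes", "Yes"],
    "total_bill": [10.0, 20.0, 15.0, 5.0, 30.0],
    "tip": [1.0, 3.0, 2.0, 1.0, 4.0],
})

# Group by two key columns and sum the remaining numeric columns.
grouped = sample.groupby(["sex", "smoker"]).sum()
print(grouped)

# Each row of the result is labelled by one (sex, smoker) combination.
male_no_bill = grouped.loc[("Male", "No"), "total_bill"]
print(male_no_bill)  # 30.0
```

The two "Male"/"No" rows collapse into a single row whose total_bill is 10.0 + 20.0. Passing more column names in the list works the same way and simply produces more index levels.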

Step 4 - Let's look at our dataset now

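Once we run the above code snippet, the printed DataFrame is indexed by the five grouping columns (a MultiIndex), and the total_bill and tip columns hold the sum for each group. See the accompanying IPython notebook for the full output.

We can see the data being aggregated on the specified columns.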
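One detail worth knowing when reading the result: in the tips dataset the sex, smoker, day and time columns are categorical, and depending on the pandas version, grouping by several categorical columns may also produce rows for category combinations that never occur in the data. The sketch below, again on a made-up frame named `sample`, shows how the `observed` argument of `groupby` controls this, and how `reset_index()` turns the grouped index back into ordinary columns. Treat it as an illustration under these assumptions rather than part of the original recipe.

```python
import pandas as pd

# Hypothetical mini-frame with categorical keys (values are made up).
sample = pd.DataFrame({
    "day": pd.Categorical(["Sat", "Sat", "Sun"],
                          categories=["Thur", "Fri", "Sat", "Sun"]),
    "time": pd.Categorical(["Dinner", "Dinner", "Dinner"],
                           categories=["Lunch", "Dinner"]),
    "tip": [2.0, 3.0, 4.0],
})

# observed=True keeps only the key combinations that occur in the data.
observed_only = sample.groupby(["day", "time"], observed=True).sum()

# observed=False also emits rows for unused category combinations (sum is 0).
all_combos = sample.groupby(["day", "time"], observed=False).sum()

# reset_index() turns the MultiIndex levels back into ordinary columns.
flat = observed_only.reset_index()
print(flat)
print(len(observed_only), len(all_combos))  # 2 8
```

If the grouped output contains many all-zero rows, passing `observed=True` is usually the simplest way to keep only the combinations that actually appear.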

