How to present Hierarchical Data in Pandas?

This recipe helps you present hierarchical data in pandas.

This Python data science recipe does the following:
1. Creates a data dictionary.
2. Converts the dictionary into a DataFrame.
3. Indexes the data hierarchically by regiment and company.
4. Swaps the index levels.
5. Computes summary statistics for each regiment.
In [1]:
## How to present Hierarchical Data in Pandas
def Kickstarter_Example_90():
    print()
    print(format('How to present Hierarchical Data in Pandas','*^82'))

    import warnings
    warnings.filterwarnings("ignore")

    # load libraries
    import pandas as pd

    # Create dataframe
    raw_data = {'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks', 'Nighthawks',
                             'Dragoons', 'Dragoons', 'Dragoons', 'Dragoons', 'Scouts',
                             'Scouts', 'Scouts', 'Scouts'],
                'company': ['1st', '1st', '2nd', '2nd', '1st', '1st', '2nd',
                            '2nd','1st', '1st', '2nd', '2nd'],
                'name': ['Miller', 'Jacobson', 'Bali', 'Milner', 'Cooze', 'Jacon',
                         'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'],
                'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
                'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70]}
    df = pd.DataFrame(raw_data, columns = ['regiment', 'company', 'name',
                                           'preTestScore', 'postTestScore'])
    print(); print(df)

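    # Set the hierarchical index but leave the columns in place (drop=False).
    # set_index returns a new DataFrame and the result is not assigned here,
    # so df itself is unchanged and the print below shows the original frame.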
    df.set_index(['regiment', 'company'], drop=False)
    print(); print(df)

    # Set the hierarchical index to be by regiment, and then by company
    df = df.set_index(['regiment', 'company'])
    print(); print(df)

    # View the index
    print(); print(df.index)

    # Swap the levels in the index
    print(); print(df.swaplevel('regiment', 'company'))

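    # Summarize the results by regiment.
    # Note: the level= keyword on sum/count/mean/max/min was deprecated in
    # pandas 1.3 and removed in 2.0; on current releases use, for example,
    # df.groupby(level='regiment').sum(numeric_only=True) instead.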
    print(); print(df.sum(level='regiment'))
    print(); print(df.count(level='regiment'))
    print(); print(df.mean(level='regiment'))
    print(); print(df.max(level='regiment'))
    print(); print(df.min(level='regiment'))

Kickstarter_Example_90()
********************How to present Hierarchical Data in Pandas********************

      regiment company      name  preTestScore  postTestScore
0   Nighthawks     1st    Miller             4             25
1   Nighthawks     1st  Jacobson            24             94
2   Nighthawks     2nd      Bali            31             57
3   Nighthawks     2nd    Milner             2             62
4     Dragoons     1st     Cooze             3             70
5     Dragoons     1st     Jacon             4             25
6     Dragoons     2nd    Ryaner            24             94
7     Dragoons     2nd      Sone            31             57
8       Scouts     1st     Sloan             2             62
9       Scouts     1st     Piger             3             70
10      Scouts     2nd     Riani             2             62
11      Scouts     2nd       Ali             3             70

      regiment company      name  preTestScore  postTestScore
0   Nighthawks     1st    Miller             4             25
1   Nighthawks     1st  Jacobson            24             94
2   Nighthawks     2nd      Bali            31             57
3   Nighthawks     2nd    Milner             2             62
4     Dragoons     1st     Cooze             3             70
5     Dragoons     1st     Jacon             4             25
6     Dragoons     2nd    Ryaner            24             94
7     Dragoons     2nd      Sone            31             57
8       Scouts     1st     Sloan             2             62
9       Scouts     1st     Piger             3             70
10      Scouts     2nd     Riani             2             62
11      Scouts     2nd       Ali             3             70

                        name  preTestScore  postTestScore
regiment   company
Nighthawks 1st        Miller             4             25
           1st      Jacobson            24             94
           2nd          Bali            31             57
           2nd        Milner             2             62
Dragoons   1st         Cooze             3             70
           1st         Jacon             4             25
           2nd        Ryaner            24             94
           2nd          Sone            31             57
Scouts     1st         Sloan             2             62
           1st         Piger             3             70
           2nd         Riani             2             62
           2nd           Ali             3             70
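
Once the frame carries a (regiment, company) index like the one printed above, rows can be selected level by level. The following is a minimal sketch rather than part of the original recipe: it uses a smaller stand-in frame with the same column names (the four rows are an assumption made to keep the example short) and sorts the index first so that lookups are well defined.

```python
import pandas as pd

# Hypothetical stand-in frame: same column names as the recipe, fewer rows
df = pd.DataFrame({
    'regiment': ['Nighthawks', 'Nighthawks', 'Dragoons', 'Dragoons'],
    'company': ['1st', '2nd', '1st', '2nd'],
    'name': ['Miller', 'Bali', 'Cooze', 'Ryaner'],
    'preTestScore': [4, 31, 3, 24],
    'postTestScore': [25, 57, 70, 94],
}).set_index(['regiment', 'company']).sort_index()

# All rows of one regiment (outer level); the result is indexed by company
nighthawks = df.loc['Nighthawks']
print(nighthawks)

# A single cell, addressed by a full (regiment, company) key and a column
ryaner = df.loc[('Dragoons', '2nd'), 'name']
print(ryaner)

# All rows of one company across regiments (inner level)
first_companies = df.xs('1st', level='company')
print(first_companies)
```

Selecting on the outer level with `.loc` drops that level from the result, while `.xs` with `level=` does the same for an inner level.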

MultiIndex(levels=[['Dragoons', 'Nighthawks', 'Scouts'], ['1st', '2nd']],
           labels=[[1, 1, 1, 1, 0, 0, 0, 0, 2, 2, 2, 2], [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1]],
           names=['regiment', 'company'])
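
The `labels=` field in the index printed above comes from an older pandas release; from pandas 0.24 onward the same integer positions are exposed as `codes`, and recent releases print the index as a list of tuples instead. A minimal sketch for inspecting the pieces of a MultiIndex directly, using a small stand-in frame (its four rows are an assumption, not the recipe's data):

```python
import pandas as pd

# Hypothetical stand-in frame with the recipe's index columns
df = pd.DataFrame({
    'regiment': ['Nighthawks', 'Nighthawks', 'Dragoons', 'Scouts'],
    'company': ['1st', '2nd', '1st', '2nd'],
    'preTestScore': [4, 31, 3, 2],
}).set_index(['regiment', 'company'])

idx = df.index
print(idx.names)      # level names
print(idx.levels)     # unique values per level, stored in sorted order
print(idx.codes)      # per-row integer positions into each level
print(idx.to_list())  # the index as (regiment, company) tuples
```

Each row's index entry is recovered by looking up its code in the matching level, which is why the levels can be sorted while the rows keep their original order.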

                        name  preTestScore  postTestScore
company regiment
1st     Nighthawks    Miller             4             25
        Nighthawks  Jacobson            24             94
2nd     Nighthawks      Bali            31             57
        Nighthawks    Milner             2             62
1st     Dragoons       Cooze             3             70
        Dragoons       Jacon             4             25
2nd     Dragoons      Ryaner            24             94
        Dragoons        Sone            31             57
1st     Scouts         Sloan             2             62
        Scouts         Piger             3             70
2nd     Scouts         Riani             2             62
        Scouts           Ali             3             70
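
As the output above shows, `swaplevel` only exchanges the two index levels; the rows stay in their original order, so the same company appears in several separate runs. A minimal sketch, assuming a small stand-in frame with the recipe's column names, of following the swap with `sort_index` to group the rows by the new outer level:

```python
import pandas as pd

# Hypothetical stand-in frame: same index columns as the recipe, fewer rows
df = pd.DataFrame({
    'regiment': ['Nighthawks', 'Nighthawks', 'Dragoons', 'Dragoons'],
    'company': ['1st', '2nd', '1st', '2nd'],
    'name': ['Miller', 'Bali', 'Cooze', 'Ryaner'],
}).set_index(['regiment', 'company'])

# swaplevel returns a new frame; df itself keeps (regiment, company)
swapped = df.swaplevel('regiment', 'company')

# Sorting the swapped index groups rows by company, then by regiment
grouped = swapped.sort_index()
print(grouped)
```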

            preTestScore  postTestScore
regiment
Nighthawks            61            238
Dragoons              62            246
Scouts                10            264

            name  preTestScore  postTestScore
regiment
Dragoons       4             4              4
Nighthawks     4             4              4
Scouts         4             4              4

            preTestScore  postTestScore
regiment
Nighthawks         15.25           59.5
Dragoons           15.50           61.5
Scouts              2.50           66.0

              name  preTestScore  postTestScore
regiment
Nighthawks  Milner            31             94
Dragoons      Sone            31             94
Scouts       Sloan             3             70

             name  preTestScore  postTestScore
regiment
Nighthawks   Bali             2             25
Dragoons    Cooze             3             25
Scouts        Ali             2             62
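
The per-regiment summaries above were produced with the `level=` keyword, which pandas 2.0 and later no longer accept. A minimal sketch of the same kind of summary using `groupby(level=...)`, on a small stand-in frame (the four rows are an assumption, so the numbers differ from the output above):

```python
import pandas as pd

# Hypothetical stand-in frame: same column names as the recipe, fewer rows
df = pd.DataFrame({
    'regiment': ['Nighthawks', 'Nighthawks', 'Dragoons', 'Dragoons'],
    'company': ['1st', '2nd', '1st', '2nd'],
    'name': ['Miller', 'Bali', 'Cooze', 'Ryaner'],
    'preTestScore': [4, 31, 3, 24],
    'postTestScore': [25, 57, 70, 94],
}).set_index(['regiment', 'company'])

# Group on the outer index level instead of passing level= to sum/mean/...
by_regiment = df.groupby(level='regiment')
scores = ['preTestScore', 'postTestScore']

totals = by_regiment[scores].sum()    # per-regiment totals
means = by_regiment[scores].mean()    # per-regiment averages
counts = by_regiment.count()          # non-null values per column
summary = by_regiment[scores].agg(['min', 'max'])  # several statistics at once

print(totals)
print(means)
print(counts)
print(summary)
```

Selecting the score columns before aggregating keeps the text column `name` out of the numeric summaries. Note that `groupby` sorts the group keys by default; pass `sort=False` to keep the order of first appearance.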
