How to create crosstabs from a Dictionary in Python?
DATA MUNGING DATA CLEANING PYTHON MACHINE LEARNING RECIPES PANDAS CHEATSHEET     ALL TAGS

How to create crosstabs from a Dictionary in Python?

How to create crosstabs from a Dictionary in Python?

This recipe helps you create crosstabs from a Dictionary in Python

0

Recipe Objective

Have you ever tried to do a cross features analysis for that you need to create crosstabs basis on different features.

So this is the recipe on we can create crosstabs from a Dictionary in Python.

Step 1 - Import the library

import pandas as pd

We have imported pandas which is needed.

Step 2 - Setting up the Data

We have created a dataset by making a dictionary with features and passing it through the dataframe function. raw_data = {"first_name": ["Sheldon", "Raj", "Leonard", "Howard", "Amy"], "last_name": ["Copper", "Koothrappali", "Hofstadter", "Wolowitz", "Fowler"], "age": [42, 38, 36, 41, 35], "Comedy_Score": [9, 7, 8, 8, 5], "Rating_Score": [25, 25, 49, 62, 70]} df = pd.DataFrame(raw_data, columns = ["first_name", "last_name", "age", "Comedy_Score", "Rating_Score"]) print(df)

Step 3 - Making CrossTab Table

For better understanding we are making different datasets with different number of features for crosstab table. First we have created for one feature that is first_name then in next for two and then for three features. df1 = pd.crosstab(df.first_name, df.age, margins=True) print(df1) df2 = pd.crosstab([df.age, df.Comedy_Score], df.first_name, margins=True) print(df2) df3 = pd.crosstab([df.age, df.Comedy_Score, df.Rating_Score], df.first_name, margins=True) print(df3) So the output comes as

  first_name     last_name  age  Comedy_Score  Rating_Score
0    Sheldon        Copper   42             9            25
1        Raj  Koothrappali   38             7            25
2    Leonard    Hofstadter   36             8            49
3     Howard      Wolowitz   41             8            62
4        Amy        Fowler   35             5            70

age         35  36  38  41  42  All
first_name                         
Amy          1   0   0   0   0    1
Howard       0   0   0   1   0    1
Leonard      0   1   0   0   0    1
Raj          0   0   1   0   0    1
Sheldon      0   0   0   0   1    1
All          1   1   1   1   1    5

first_name        Amy  Howard  Leonard  Raj  Sheldon  All
age Comedy_Score                                         
35  5               1       0        0    0        0    1
36  8               0       0        1    0        0    1
38  7               0       0        0    1        0    1
41  8               0       1        0    0        0    1
42  9               0       0        0    0        1    1
All                 1       1        1    1        1    5

first_name                     Amy  Howard  Leonard  Raj  Sheldon  All
age Comedy_Score Rating_Score                                         
35  5            70              1       0        0    0        0    1
36  8            49              0       0        1    0        0    1
38  7            25              0       0        0    1        0    1
41  8            62              0       1        0    0        0    1
42  9            25              0       0        0    0        1    1
All                              1       1        1    1        1    5

Relevant Projects

Data Science Project - Instacart Market Basket Analysis
Data Science Project - Build a recommendation engine which will predict the products to be purchased by an Instacart consumer again.

Resume parsing with Machine learning - NLP with Python OCR and Spacy
In this machine learning resume parser example we use the popular Spacy NLP python library for OCR and text classification.

Data Science Project-TalkingData AdTracking Fraud Detection
Machine Learning Project in R-Detect fraudulent click traffic for mobile app ads using R data science programming language.

Walmart Sales Forecasting Data Science Project
Data Science Project in R-Predict the sales for each department using historical markdown data from the Walmart dataset containing data of 45 Walmart stores.

Time Series Forecasting with LSTM Neural Network Python
Deep Learning Project- Learn to apply deep learning paradigm to forecast univariate time series data.

Predict Employee Computer Access Needs in Python
Data Science Project in Python- Given his or her job role, predict employee access needs using amazon employee database.

Solving Multiple Classification use cases Using H2O
In this project, we are going to talk about H2O and functionality in terms of building Machine Learning models.

Customer Market Basket Analysis using Apriori and Fpgrowth algorithms
In this data science project, you will learn how to perform market basket analysis with the application of Apriori and FP growth algorithms based on the concept of association rule learning.

Build a Collaborative Filtering Recommender System in Python
Use the Amazon Reviews/Ratings dataset of 2 Million records to build a recommender system using memory-based collaborative filtering in Python.

Ecommerce product reviews - Pairwise ranking and sentiment analysis
This project analyzes a dataset containing ecommerce product reviews. The goal is to use machine learning models to perform sentiment analysis on product reviews and rank them based on relevance. Reviews play a key role in product recommendation systems.