What is a cross tab function in pandas when is it used

This recipe explains what is a cross tab function in pandas and when is it used

Recipe Objective

Cross tab is used to compute a simple cross-tabulation of two (or more) factors. By default, it computes a frequency table of the factors unless an array of values and an aggregation function are passed.

So this recipe is a short example on What is a cross tab function and when is it used. Let's get started.

Step 1 - Import the library

import pandas as pd

Let's pause and look at these imports. Pandas is generally used for performing mathematical operation and preferably over arrays.

Step 2 - Setup the Data

x = pd.Categorical(['a', 'b'], categories=['a', 'b', 'c']) y = pd.Categorical(['d', 'e'], categories=['d', 'e', 'f'])

Here we have two categorical dataset x and y.

Now, our dataset is ready.

Step 3 - Performing Cross tab

pd.crosstab(x, y)

Simply use crosstab function to perform the operation.

Step 4 - Let's look at our dataset now

Once we run the above code snippet, we will see:

Scroll down the ipython file to visualize the final output.

Here 'c' and 'f' are not represented in the data and will not be shown in the output because dropna is True by default. Set dropna=False to preserve categories with no data.

What Users are saying..

profile image

Ed Godalle

Director Data Analytics at EY / EY Tech
linkedin profile url

I am the Director of Data Analytics with over 10+ years of IT experience. I have a background in SQL, Python, and Big Data working with Accenture, IBM, and Infosys. I am looking to enhance my skills... Read More

Relevant Projects

Azure Deep Learning-Deploy RNN CNN models for TimeSeries
In this Azure MLOps Project, you will learn to perform docker-based deployment of RNN and CNN Models for Time Series Forecasting on Azure Cloud.

Time Series Python Project using Greykite and Neural Prophet
In this time series project, you will forecast Walmart sales over time using the powerful, fast, and flexible time series forecasting library Greykite that helps automate time series problems.

Build a Similar Images Finder with Python, Keras, and Tensorflow
Build your own image similarity application using Python to search and find images of products that are similar to any given product. You will implement the K-Nearest Neighbor algorithm to find products with maximum similarity.

Avocado Machine Learning Project Python for Price Prediction
In this ML Project, you will use the Avocado dataset to build a machine learning model to predict the average price of avocado which is continuous in nature based on region and varieties of avocado.

Isolation Forest Model and LOF for Anomaly Detection in Python
Credit Card Fraud Detection Project - Build an Isolation Forest Model and Local Outlier Factor (LOF) in Python to identify fraudulent credit card transactions.

PyTorch Project to Build a LSTM Text Classification Model
In this PyTorch Project you will learn how to build an LSTM Text Classification model for Classifying the Reviews of an App .

Llama2 Project for MetaData Generation using FAISS and RAGs
In this LLM Llama2 Project, you will automate metadata generation using Llama2, RAGs, and AWS to reduce manual efforts.

NLP Project on LDA Topic Modelling Python using RACE Dataset
Use the RACE dataset to extract a dominant topic from each document and perform LDA topic modeling in python.

Deep Learning Project for Beginners with Source Code Part 1
Learn to implement deep neural networks in Python .

Learn How to Build a Linear Regression Model in PyTorch
In this Machine Learning Project, you will learn how to build a simple linear regression model in PyTorch to predict the number of days subscribed.