What is a cross tab function when is it used?
MACHINE LEARNING RECIPES DATA CLEANING PYTHON DATA MUNGING PANDAS CHEATSHEET     ALL TAGS

What is a cross tab function when is it used?

What is a cross tab function when is it used?

This recipe explains what is a cross tab function when is it used

Recipe Objective

Cross tab is used to compute a simple cross-tabulation of two (or more) factors. By default, it computes a frequency table of the factors unless an array of values and an aggregation function are passed.

So this recipe is a short example on What is a cross tab function and when is it used. Let's get started.

Step 1 - Import the library

import pandas as pd

Let's pause and look at these imports. Pandas is generally used for performing mathematical operation and preferably over arrays.

Step 2 - Setup the Data

x = pd.Categorical(['a', 'b'], categories=['a', 'b', 'c']) y = pd.Categorical(['d', 'e'], categories=['d', 'e', 'f'])

Here we have two categorical dataset x and y.

Now, our dataset is ready.

Step 3 - Performing Cross tab

pd.crosstab(x, y)

Simply use crosstab function to perform the operation.

Step 4 - Let's look at our dataset now

Once we run the above code snippet, we will see:

Scroll down the ipython file to visualize the final output.

Here 'c' and 'f' are not represented in the data and will not be shown in the output because dropna is True by default. Set dropna=False to preserve categories with no data.

Relevant Projects

Ecommerce product reviews - Pairwise ranking and sentiment analysis
This project analyzes a dataset containing ecommerce product reviews. The goal is to use machine learning models to perform sentiment analysis on product reviews and rank them based on relevance. Reviews play a key role in product recommendation systems.

Ola Bike Rides Request Demand Forecast
Given big data at taxi service (ride-hailing) i.e. OLA, you will learn multi-step time series forecasting and clustering with Mini-Batch K-means Algorithm on geospatial data to predict future ride requests for a particular region at a given time.

Resume parsing with Machine learning - NLP with Python OCR and Spacy
In this machine learning resume parser example we use the popular Spacy NLP python library for OCR and text classification.

Topic modelling using Kmeans clustering to group customer reviews
In this Kmeans clustering machine learning project, you will perform topic modelling in order to group customer reviews based on recurring patterns.

Time Series Analysis Project in R on Stock Market forecasting
In this time series project, you will build a model to predict the stock prices and identify the best time series forecasting model that gives reliable and authentic results for decision making.

Deep dive into BERT & transformers - Part 1
In this project, we will cover in detail the architecture of a transformer used in natural language processing use cases. We will go through the key nlp areas in the pre-transformer stage like bow, word2vec...and then the origin and gradual refinement of transformers. Finally, we will study one of the most popular state of the art transformer models, called BERT and use it for text classification on a large dataset.

House Price Prediction Project using Machine Learning
Use the Zillow dataset to follow a test-driven approach and build a regression machine learning model to predict the price of the house based on other variables.

Build OCR from Scratch Python using YOLO and Tesseract
In this deep learning project, you will learn how to build your custom OCR (optical character recognition) from scratch by using Google Tesseract and YOLO to read the text from any images.

Customer Market Basket Analysis using Apriori and Fpgrowth algorithms
In this data science project, you will learn how to perform market basket analysis with the application of Apriori and FP growth algorithms based on the concept of association rule learning.

Time Series Python Project using Greykite and Neural Prophet
In this time series project, you will forecast Walmart sales over time using the powerful, fast, and flexible time series forecasting library Greykite that helps automate time series problems.