How to create Contingency tables in the StatsModels library?

This recipe describes how to create Contingency tables in the StatsModels library

Recipe Objective - How to create Contingency tables in the StatsModels library?

statsmodels supports a variety of approaches for analyzing contingency tables, including methods for assessing independence, symmetry, and homogeneity, and methods for working with collections of tables from a stratified population. The methods described here apply primarily to two-way tables. Multi-way tables can be analyzed with log-linear models. There is currently no dedicated API for log-linear modeling in statsmodels, but you can use Poisson regression in statsmodels.genmod.GLM for this purpose.

statsmodels.stats.Table is the most basic class for working with contingency tables. You can create a Table object directly from any rectangular array-like object containing the cell counts of a contingency table.

# Importing libraries
import statsmodels.api as sm
import pandas as pd

# Load the flchain dataset from the R "survival" package (requires an internet connection)
X = sm.datasets.get_rdataset("flchain", "survival").data

# Cross-tabulate two features (chapter and sex)
crosstab = pd.crosstab(X['chapter'], X['sex'])

# Build the Table object and display the original cell counts
cont_table = sm.stats.Table(crosstab)
cont_table.table_orig

Output-
sex	F	M
chapter		
Blood	1	3
Circulatory	401	344
Congenital	0	3
Digestive	37	29
Endocrine	25	23
External Causes	35	31
Genitourinary	20	22
Ill Defined	25	13
Infectious	23	9
Injury and Poisoning	12	9
Mental	100	44
Musculoskeletal	11	3
Neoplasms	279	288
Nervous	73	57
Respiratory	121	124
Skin	2	2

In this way, we can create contingency tables in the StatsModels library.
