Generalized estimating equations in the StatsModels library

In this recipe, you will learn about Generalized Estimating Equations in StatsModels.

Recipe Objective - What are Generalized Estimating Equations in the StatsModels library?

Generalized Estimating Equations (GEE) estimate generalized linear models for panel, cluster, or repeated-measures data in which observations may be correlated within a cluster but are assumed to be uncorrelated across clusters. GEE supports the same one-parameter exponential families as generalized linear models (GLM).

Example:

# Importing libraries
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Importing heart dataset from statsmodels in the form of pandas dataframe
data = sm.datasets.heart.load_pandas()

# Storing data
X = data.data

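# Fit and summarize a GEE model with 'survival' as the dependent variable and 'age' and 'censors' as independent variables
# The second argument names the grouping (cluster) column; the heart dataset has no cluster identifier,
# so 'survival' is used here only to illustrate the call
# Instantiate a Poisson family model
model = smf.gee("survival ~ age + censors", "survival", X, family=sm.families.Poisson())
result = model.fit()

# Model summary
print(result.summary())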

Output -

                         GEE Regression Results
Dep. Variable:                    survival   No. Observations:        69
Model:                                 GEE   No. clusters:            64
Method:                        Generalized   Min. cluster size:        1
                      Estimating Equations   Max. cluster size:        3
Family:                            Poisson   Mean cluster size:      1.1
Dependence structure:         Independence   Num. iterations:          2
Date:                     Mon, 22 Nov 2021   Scale:                1.000
Covariance type:                    robust   Time:              17:44:01

                coef   std err         z     P>|z|    [0.025    0.975]
Intercept     6.2545     0.615    10.173     0.000     5.049     7.459
age           0.0058     0.013     0.439     0.661    -0.020     0.032
censors      -1.1298     0.271    -4.176     0.000    -1.660    -0.600

Skew:                       1.1451   Kurtosis:            0.6999
Centered skew:             -3.3717   Centered kurtosis:  31.4778
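The Poisson family uses a log link by default, so the coefficients in the summary are on the log scale. As a rough sketch of how to read them, the snippet below exponentiates the reported estimates and their 95% confidence bounds. The numbers are copied by hand from the summary above, and the dictionary layout is only an illustrative choice.

```python
import math

# Coefficients and 95% confidence bounds copied from the summary above.
estimates = {
    "age": (0.0058, -0.020, 0.032),
    "censors": (-1.1298, -1.660, -0.600),
}

# With a log link, exp(coef) is the multiplicative change in the expected
# outcome for a one-unit increase in the covariate.
rate_ratios = {
    name: tuple(math.exp(v) for v in values)
    for name, values in estimates.items()
}

for name, (ratio, lower, upper) in rate_ratios.items():
    print(f"{name}: ratio={ratio:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

Read this way, the censors ratio of roughly 0.32 means the expected survival value for censors = 1 is about a third of that for censors = 0 at the same age, while the ratio for age is very close to 1, which matches its large p-value.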

In this way, we can use Generalized Estimating Equations in the StatsModels library.
