Perform Survival and Duration Analysis in the StatsModels library

With this recipe, you can perform survival and duration analysis in StatsModels.

Recipe Objective - How to perform survival and duration analysis in the StatsModels library?

The statsmodels.duration module implements several standard methods for working with censored data. These methods are most commonly used when the data consist of durations between an origin time point and the time at which some event of interest occurred. A typical example is a medical study in which the origin is the time at which a person is diagnosed with a disease, and the event of interest is death (or disease progression, recovery, etc.).

Currently, only right censoring is handled. Right censoring occurs when you know that the event happened after a specific time t, but you do not know the exact time of the event.
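
To make the definition concrete, here is a minimal sketch of how right-censored observations are usually encoded as a (time, status) pair. The helper observe and the follow-up records are hypothetical and exist only for illustration; in real data the true event time of a censored subject is unknown.

```python
# Hypothetical records: (true_event_time, end_of_follow_up).
# The true event time is shown only to illustrate how the observed data arise.
def observe(event_time, follow_up_end):
    """Return (observed_time, status) under right censoring."""
    if event_time <= follow_up_end:
        return event_time, 1      # event observed within follow-up
    return follow_up_end, 0       # censored: event happens after follow_up_end


records = [(4, 10), (12, 10), (7, 7), (9, 5)]
observed = [observe(t, c) for t, c in records]

for obs_time, status in observed:
    label = "event" if status == 1 else "censored"
    print(obs_time, status, label)
```

A status of 1 marks an observed event and 0 marks a censored observation, which is the convention expected by the status argument of SurvfuncRight.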

Survival function estimation and inference

The statsmodels.api.SurvfuncRight class can be used to estimate a survival function from data that may be right-censored. SurvfuncRight implements several inference procedures, including confidence intervals for survival distribution quantiles, pointwise and simultaneous confidence bands for the survival function, and plotting methods. The duration.survdiff function provides testing procedures for comparing survival distributions.
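
As a rough illustration of what estimating a survival function from right-censored data involves, the sketch below computes the product-limit (Kaplan-Meier) estimate by hand in plain Python. It assumes this standard estimator and is not the statsmodels implementation; the function name kaplan_meier and the data are made up.

```python
from collections import Counter


def kaplan_meier(times, status):
    """Product-limit estimate: list of (event_time, S(t)) pairs."""
    # Number of observed events at each distinct event time.
    events = Counter(t for t, s in zip(times, status) if s == 1)
    surv = 1.0
    curve = []
    for t in sorted(events):
        # Subjects still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        surv *= 1.0 - events[t] / at_risk
        curve.append((t, surv))
    return curve


# Made-up data: status 1 = event observed, 0 = right-censored.
times = [2, 3, 3, 5, 6, 8]
status = [1, 1, 0, 1, 0, 0]

km_curve = kaplan_meier(times, status)
for t, s in km_curve:
    print(t, round(s, 4))
```

Censored subjects never trigger a drop in the curve, but they do count in the risk set until their censoring time, which is why they cannot simply be discarded.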

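Here we create a SurvfuncRight object using data from the Moore study, which is available through the R datasets repository. We fit the survival distribution only for subjects whose fcategory is 'low'. SurvfuncRight takes the observed times as its first argument and the event status as its second (1 if the event was observed, 0 if the observation was censored). The Moore data has no censoring indicator, so passing fscore as the status, as done below, only illustrates the call: the reported event counts exceed the number at risk and the estimated survival probabilities are all 0.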

Example:

# Importing libraries
import statsmodels.api as sm

# Importing moore dataset from carData package
X = sm.datasets.get_rdataset("Moore", "carData").data

# Filtering data of low fcategory
X = X[X['fcategory'] == "low"]

# Creating SurvfuncRight model
model = sm.SurvfuncRight(X["conformity"], X["fscore"])

# Model Summary
model.summary()

Output -
Time  Surv prob  Surv prob SE  num at risk  num events
6     0.0        0.0           15           46.0
7     0.0        0.0           13           48.0
8     0.0        0.0           11           54.0
10    0.0        0.0           9            36.0
12    0.0        0.0           8            93.0
13    0.0        0.0           5            31.0
15    0.0        0.0           4            30.0
16    0.0        0.0           3            35.0
21    0.0        0.0           2            16.0
23    0.0        0.0           1            15.0

In this way, we can perform survival and duration analysis in the StatsModels library.
