Ensemble Machine Learning Project - Allstate Insurance Claims Severity Prediction

In this ensemble machine learning project, we will predict the severity of the claims an insurance company receives. The solution is implemented in Python using ensemble machine learning algorithms.


Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 50+ solved projects with IPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

What you will learn

Detailed business description and the problem being addressed through analytics
Loading data with the popular pandas Python package
Dataset overview and how to analyze a sample of the dataset
Exploratory data analysis to understand the Allstate insurance claim dataset
Analyzing the five-number summary and studying the distribution of categorical variables
Handling missing values for categorical and continuous variables
Outlier treatment with visual techniques (Box-Plots)
Difference between label encoding and one-hot encoding, and when to use each technique
Using the pickle file format to store and load models
Feature selection and elimination using correlation, constant-variance checks, and chi-square statistical tests
Understanding ensemble Machine Learning algorithms
Hyper-parameter tuning using scikit-learn functions
Model selection using RMSE as the model evaluation metric
Model deployment by creating a Flask API
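
To make the label versus one-hot encoding item above concrete, here is a minimal sketch in plain Python. The category values are invented toy data, not taken from the Allstate dataset, and the project itself may well use library encoders instead; this only illustrates what each technique produces.

```python
# Illustrative sketch only: toy values, not taken from the Allstate dataset.
colors = ["red", "green", "blue", "green"]

# Label encoding: map each category to a single integer code.
# Sorting the categories makes the mapping deterministic.
categories = sorted(set(colors))
label_map = {cat: idx for idx, cat in enumerate(categories)}
label_encoded = [label_map[c] for c in colors]

# One-hot encoding: one binary column per category, in the same sorted order.
one_hot_encoded = [
    [1 if c == cat else 0 for cat in categories]
    for c in colors
]

print(label_encoded)
print(one_hot_encoded)
```

Label encoding keeps a single column but implies an ordering among categories, while one-hot encoding avoids that implied order at the cost of one extra column per category.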

Project Description

Allstate, a personal insurance company in the United States, wants to use data science to predict the severity and cost of insurance claims after an unforeseen event.

This ensemble machine learning project will help you understand best practices for approaching a data analytics problem in Python with data science packages. We will predict how severe insurance claims will be for Allstate, using ensemble machine learning algorithms.
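
As a rough illustration of the ensemble idea, the sketch below averages the predictions of two base models and scores each with RMSE. Simple averaging is used here only because it is the easiest ensemble to show; it is not the random forest or GBM covered in the project. All numbers and the names `model_a` and `model_b` are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical claim costs and predictions from two hypothetical base models.
actual = [100.0, 200.0, 300.0]
model_a = [110.0, 190.0, 330.0]
model_b = [90.0, 220.0, 290.0]

# Simplest possible ensemble: average the base predictions for each claim.
ensemble = [(a + b) / 2 for a, b in zip(model_a, model_b)]

print("model_a RMSE: ", rmse(actual, model_a))
print("model_b RMSE: ", rmse(actual, model_b))
print("ensemble RMSE:", rmse(actual, ensemble))
```

In this toy case the errors of the two models partly cancel, so the averaged prediction has a lower RMSE than either model alone. That outcome is not guaranteed on real data.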

Similar Projects

In this R data science project, we will explore a wine dataset to assess red wine quality. The objective is to find out which chemical properties influence the quality of red wines.

Machine learning project in R: predict customer churn in the telecom sector and find the key drivers that lead to churn. Learn how a logistic regression model in R can be used to identify customer churn in a telecom dataset.

In this data science project in R, we discuss subjective segmentation, a clustering technique used to find product bundles in sales data.

Curriculum For This Mini Project

Business Problem Overview
Dataset Overview
Exploratory Data Analysis
Data Cleaning And Pre-processing
Handling Outliers
Dependent Variable Analysis - Introduction To ML Algorithms
Feature Selection - Continuous Variables
Feature Selection - 2
Variable Encoding - One Hot Technique
Categorical Feature Selection - Chi Square Test
Building A Machine Learning Model - Random Forest - Hyper Parameter Tuning
Model Validation - GBM (Gradient Boosting Machine) Model
Model Prediction On Test Data
Model Deployment - API
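
The store-and-load step that typically sits between model building and deployment can be sketched with Python's standard `pickle` module. This is a minimal illustration under stated assumptions: the "model" is a hypothetical dict of learned parameters standing in for a fitted estimator, and the file name `model.pkl` and the `predict` helper are invented for the example.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a fitted model: a dict of learned parameters.
model = {"kind": "mean_predictor", "mean": 200.0}


def predict(fitted, n_rows):
    """Return the stored mean as the prediction for each of n_rows inputs."""
    return [fitted["mean"]] * n_rows


with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "model.pkl")  # hypothetical file name

    # Training side: write the fitted model to disk.
    with open(model_path, "wb") as f:
        pickle.dump(model, f)

    # Serving side (for example, when an API process starts): load it back.
    with open(model_path, "rb") as f:
        restored = pickle.load(f)

print(predict(restored, 2))
```

Only unpickle files from a trusted source, because loading a pickle can execute arbitrary code.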