Data Science Project: Allstate Insurance Claims Severity Prediction

Data science project in R to develop automated methods for predicting the cost and severity of insurance claims.

Videos

Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 50+ solved projects with IPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

What you will learn

  • Understanding the problem statement

  • Importing the train and test datasets directly from the source

  • Performing basic EDA and checking for null values

  • Filling the null values using the most suitable methods

  • Visualizing distributions using density plots

  • Using a correlation plot to understand the correlation between different features

  • Visualizing the combined effect of different variables on the target

  • Using box-and-whisker plots to visualize and handle outliers

  • Checking for autocorrelation, normality, multicollinearity, and heteroscedasticity

  • Fitting a linear model and plotting its residuals

  • Preparing the dataset for fitting the XGBoost model

  • Defining parameters for the XGBoost model

  • Training the XGBoost models and evaluating their accuracy

  • Making final predictions for the test dataset

  • Plotting graphs of Train_loss versus Train_preds
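
To make the outlier-handling step above more concrete, here is a minimal sketch of capping values at the box-plot whiskers. The project itself is taught in R; this illustration uses Python's standard library only, and the function name `cap_outliers`, the 1.5 × IQR whisker rule, and the toy claim amounts are all assumptions made for the example rather than details taken from the course.

```python
from statistics import quantiles


def cap_outliers(values, k=1.5):
    """Cap values that fall outside the box-plot whiskers.

    Whiskers are assumed to sit at Q1 - k*IQR and Q3 + k*IQR
    (k = 1.5 is the common box-plot convention, not a course setting).
    """
    # quantiles() sorts internally and returns [Q1, median, Q3] for n=4.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Values beyond a whisker are pulled back to it; rows are never dropped.
    return [min(max(v, lower), upper) for v in values]


# Hypothetical claim amounts with one extreme value (made-up numbers).
claims = [1200.0, 1500.0, 1800.0, 2100.0, 2400.0, 2700.0, 45000.0]
capped = cap_outliers(claims)
print(capped)
```

Capping keeps every row in the training set, which may matter when the extreme claims are real rather than data-entry errors; dropping the rows instead would be a different design choice.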

Project Description

When you’ve been devastated by a serious car accident, your focus is on the things that matter the most: family, friends, and other loved ones. Pushing paper with your insurance agent is the last place you want your time or mental energy spent. This is why Allstate, a personal insurer in the United States, is continually seeking fresh ideas to improve its claims service for the over 16 million households it protects.

In this data science project, you will develop automated methods for predicting the cost and severity of claims.

Curriculum For This Mini Project

 
  • Import Data Files (01m)
  • Problem Statement (04m)
  • Data Overview (02m)
  • What Model to Use? (03m)
  • Exploratory Data Analysis (12m)
  • Distribution Type (02m)
  • Shapiro Test (04m)
  • Transformations (06m)
  • Outliers (10m)
  • Removing Outliers - Capping Method (39m)
  • Dependent Variable Distribution (09m)
  • Recap (01m)
  • Modelling Techniques (02m)
  • Correlation Table (03m)
  • Zero Variance (04m)
  • Residuals Having Normal Distribution (21m)
  • Multicollinearity (02m)
  • Multicollinearity - Stepwise AIC (13m)
  • Multicollinearity - Compute VIF (02m)
  • Heteroscedasticity (02m)
  • XGBoost Model (16m)
  • Conclusion (00m)
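
The curriculum lists a lesson on computing the variance inflation factor (VIF) as a multicollinearity check. As a rough illustration of what that number measures, the sketch below handles only the simplest case of two predictors, where the VIF reduces to 1 / (1 - r²), with r the Pearson correlation between them. The course works in R; this is a standard-library Python sketch, and the helper names and toy data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def vif_two_predictors(x1, x2):
    """VIF of either predictor when the model has exactly two predictors.

    With two predictors, the R^2 from regressing one on the other equals
    the squared Pearson correlation, so VIF = 1 / (1 - r^2).
    """
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Toy feature columns (made-up values, not from the Allstate data).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]    # nearly a copy of x1
x3 = [1.0, -2.0, 2.0, -2.0, 1.0]  # constructed to be uncorrelated with x1

print(vif_two_predictors(x1, x2))  # large: x1 and x2 carry the same signal
print(vif_two_predictors(x1, x3))  # 1.0: no inflation
```

With more than two predictors, each VIF comes from regressing one predictor on all the others, which needs a least-squares fit rather than a single correlation. A VIF above roughly 5 to 10 is a common rule-of-thumb signal of problematic collinearity.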