Allstate Insurance Purchase Prediction Challenge Solution

Data Science Project: Predict the car insurance policy a customer buys after receiving a number of quotes.


Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 50+ solved projects with IPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

Customer Love

Ray Han

Tech Leader | Stanford / Yale University

I think that they are fantastic. I attended Yale and Stanford and have worked at Honeywell, Oracle, and Arthur Andersen (Accenture) in the US. I have taken Big Data and Hadoop, NoSQL, Spark, Hadoop...

Mike Vogt

Information Architect at Bank of America

I have had a very positive experience. The platform is very rich in resources, and the expert was thoroughly knowledgeable on the subject matter - real world hands-on experience. I wish I had this...

What will you learn

Understanding the problem statement
Importing the Train and Test dataset
Importing useful libraries and understanding their significance
Difference between a Series and a DataFrame
Data preparation and utilities
Grouping all columns of data into combinations
Cantor pairing
Checking for null values and performing basic EDA
Imputing null values using appropriate methods
What is "Pickling" and "Unpickling"
Defining a class for parallel computation to reduce memory usage
Applying a Random Forest Regressor as the training model
Using the majority voting method to determine the best solution
Hyperparameter tuning using cross-validation to prevent overfitting
Making the final predictions and saving the predictions in CSV format
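
As an illustration of the Cantor pairing step listed above, here is a minimal sketch in Python. It shows the standard Cantor pairing function, which maps two non-negative integers to a single unique integer, and its inverse. How the project itself applies the pairing is not described on this page; using it to encode a combination of two column values as one feature is an assumption, and the function names cantor_pair and cantor_unpair are hypothetical.

```python
import math
from typing import Tuple


def cantor_pair(a: int, b: int) -> int:
    """Map two non-negative integers to one unique non-negative integer."""
    if a < 0 or b < 0:
        raise ValueError("Cantor pairing is defined for non-negative integers")
    return (a + b) * (a + b + 1) // 2 + b


def cantor_unpair(z: int) -> Tuple[int, int]:
    """Invert cantor_pair and recover the original (a, b)."""
    if z < 0:
        raise ValueError("z must be non-negative")
    # w is the largest integer with w * (w + 1) // 2 <= z
    w = (math.isqrt(8 * z + 1) - 1) // 2
    b = z - w * (w + 1) // 2
    a = w - b
    return a, b


# Assumed use: fold two integer-coded column values into one combined feature.
combined = cantor_pair(1, 2)
original = cantor_unpair(combined)
```

Because the mapping is one-to-one, two different value pairs never collide, which is what makes it usable as a compact key for a pair of integer-coded columns.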

Project Description

The training and test sets in this data science project contain transaction history for customers that ended up purchasing a policy. For each customer_ID, you are given their quote history. In the training set you have the entire quote history, the last row of which contains the coverage options they purchased. In the test set, you have only a partial history of the quotes and do not have the purchased coverage options. These are truncated to certain lengths to simulate making predictions with less history (higher uncertainty) or more history (lower uncertainty).
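
The layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each customer's history is a list of rows in time order, the last row is the purchase, and the option names "A" and "B" in the toy data are hypothetical. The helpers split_history and truncate_quotes are not part of the project code; the random cut only mimics the idea of a truncated test history, not the competition's actual truncation rule.

```python
import random
from typing import Dict, List, Tuple

Row = Dict[str, int]


def split_history(rows: List[Row]) -> Tuple[List[Row], Row]:
    """Separate the quotes from the final row, which holds the purchase."""
    if len(rows) < 2:
        raise ValueError("expected at least one quote followed by a purchase row")
    return rows[:-1], rows[-1]


def truncate_quotes(quotes: List[Row], rng: random.Random) -> List[Row]:
    """Keep a random-length prefix (at least one quote) of the quote history."""
    keep = rng.randint(1, len(quotes))  # randint is inclusive on both ends
    return quotes[:keep]


# Toy history for one customer; the option names are assumptions.
history = [
    {"A": 0, "B": 1},
    {"A": 1, "B": 1},
    {"A": 1, "B": 0},
    {"A": 1, "B": 0},  # last row: the purchased options
]

quotes, purchase = split_history(history)
partial = truncate_quotes(quotes, random.Random(0))
```

Truncating training histories this way gives a validation set that resembles the test set, where the purchase row is missing and the history is shorter.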

For each customer_ID in the test set, you must predict the seven coverage options they end up purchasing.
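
One way to combine several candidate predictions for the seven options is the majority voting mentioned in the curriculum. The sketch below is an assumption about how such a vote could work, not the project's implementation: it votes on each option independently, the option names in OPTIONS are assumed, and majority_vote is a hypothetical helper.

```python
from collections import Counter
from typing import Dict, List

# Assumed names for the seven coverage options.
OPTIONS = ["A", "B", "C", "D", "E", "F", "G"]


def majority_vote(predictions: List[Dict[str, int]]) -> Dict[str, int]:
    """Pick, for each option, the value predicted most often.

    Ties go to the value that appears first in the list of predictions.
    """
    if not predictions:
        raise ValueError("need at least one prediction to vote on")
    voted = {}
    for option in OPTIONS:
        counts = Counter(p[option] for p in predictions)
        voted[option] = counts.most_common(1)[0][0]
    return voted


# Three hypothetical model outputs for one customer.
model_outputs = [
    {"A": 1, "B": 0, "C": 2, "D": 3, "E": 0, "F": 1, "G": 2},
    {"A": 1, "B": 1, "C": 2, "D": 3, "E": 0, "F": 2, "G": 2},
    {"A": 0, "B": 1, "C": 3, "D": 3, "E": 1, "F": 2, "G": 2},
]

final_prediction = majority_vote(model_outputs)
```

Voting per option can produce a combination that no single model predicted. Voting on the whole seven-option combination instead is an equally plausible design, and this page does not say which one the project uses.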

Each customer has many shopping points, where a shopping point is defined by a customer with certain characteristics viewing a product and its associated cost at a particular time.

  • Some customer characteristics may change over time (e.g. as the customer changes or provides new information), and the cost depends on both the product and the customer characteristics.
  • A customer may represent a collection of people, as policies can cover more than one person.
  • A customer may purchase a product that was not viewed!
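
The last point is easy to check during EDA. The following sketch, with made-up data, estimates the share of customers whose purchased combination never appeared among their quotes. It assumes each history is a list of option tuples in time order with the purchase last; share_not_viewed is a hypothetical helper and the customer ids are invented.

```python
from typing import Dict, List, Tuple

Options = Tuple[int, ...]


def share_not_viewed(histories: Dict[int, List[Options]]) -> float:
    """Fraction of customers whose purchase was never among their quotes."""
    if not histories:
        raise ValueError("no customer histories given")
    not_viewed = 0
    for rows in histories.values():
        *quotes, purchase = rows  # the last row is the purchase
        if purchase not in quotes:
            not_viewed += 1
    return not_viewed / len(histories)


# Toy data: customer id -> option tuples, purchase last (values are made up).
toy_histories = {
    101: [(0, 1, 2), (1, 1, 2), (1, 1, 2)],  # purchase was quoted before
    102: [(0, 0, 1), (0, 0, 1)],             # purchase was quoted before
    103: [(1, 0, 3), (1, 1, 3), (0, 1, 3)],  # purchase was never quoted
}

share = share_not_viewed(toy_histories)
```

A nonzero share is a reminder that simply repeating one of the viewed quotes cannot be correct for every customer.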

Similar Projects

In this project, we will automate the loan eligibility process in real time, based on the details customers provide while filling out the online application form.

This project analyzes a dataset containing ecommerce product reviews. The goal is to use machine learning models to perform sentiment analysis on product reviews and rank them based on relevance. Reviews play a key role in product recommendation systems.

There are different time series forecasting methods for forecasting stock prices, demand, and so on. In this machine learning project, you will learn which forecasting method to use in which situation and how to apply it, using a time series forecasting example.

Curriculum For This Mini Project

05h 03m