Loan Eligibility Prediction in Python using H2O.ai

In this loan prediction project, you will build predictive models in Python using H2O.ai to predict whether an applicant will be able to repay a loan.

Videos

Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 102+ solved projects with iPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

What you will learn

What H2O is and why it is used
Initializing an H2O cluster that uses all CPU cores and available RAM
Importing the dataset from Amazon Web Services (AWS)
Performing basic EDA on the dataset
Visualizing the dataset using quantiles and histograms
Constructing train and test sets using sampling
Defining the data for the model and displaying the results
Applying a GLM model to make predictions
Defining mean_squared_error as the evaluation metric
Viewing model information: training statistics, performance, important variables
Defining the parameters for a deep learning neural network
Using the first part of the data, without labels, for unsupervised learning
Using the second part of the data, with labels, for supervised learning
Converting the train dataset to a lower-dimensional space with an autoencoder model
Training the DRF on the reduced feature space
Making the final predictions
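
Several of the steps above can be sketched in a few lines of H2O's Python API. The following is a minimal sketch, not the project's actual notebook: the function names, the `data_path` and `target` arguments, the 80/20 split, the hidden-layer size, and the tree count are all assumptions made for illustration. The `mean_squared_error` helper is plain Python so the metric can be checked by hand.

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between labels and predictions."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("inputs must not be empty")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def train_glm(data_path, target, seed=42):
    """Fit a binomial GLM with H2O and return (model, test-set MSE).

    data_path and target are placeholders: the real file location and
    label column are not given in this description.
    """
    # Imported here so the pure-Python helper above can be used without
    # the h2o package or a Java runtime being installed.
    import h2o
    from h2o.estimators import H2OGeneralizedLinearEstimator

    h2o.init(nthreads=-1)  # -1 asks H2O to use every available CPU core
    frame = h2o.import_file(data_path)  # accepts local paths and URLs
    frame[target] = frame[target].asfactor()  # treat the label as categorical

    # split_frame samples rows at random into roughly 80% / 20% parts.
    train, test = frame.split_frame(ratios=[0.8], seed=seed)
    features = [c for c in frame.columns if c != target]

    model = H2OGeneralizedLinearEstimator(family="binomial")
    model.train(x=features, y=target, training_frame=train)
    return model, model.model_performance(test).mse()


def train_drf_on_encoded_features(unlabeled, labeled, features, target):
    """Compress features with an autoencoder, then fit a DRF on the codes."""
    from h2o.estimators import H2OAutoEncoderEstimator, H2ORandomForestEstimator

    # Unsupervised step: learn a compact representation without labels.
    # The single 8-unit hidden layer is an illustrative choice.
    encoder = H2OAutoEncoderEstimator(hidden=[8], activation="Tanh", epochs=10)
    encoder.train(x=features, training_frame=unlabeled)

    # deepfeatures returns the activations of hidden layer 0 for each row;
    # the label column is attached again for the supervised step.
    encoded = encoder.deepfeatures(labeled, 0).cbind(labeled[target])
    encoded_features = [c for c in encoded.columns if c != target]

    drf = H2ORandomForestEstimator(ntrees=50)
    drf.train(x=encoded_features, y=target, training_frame=encoded)
    return encoder, drf
```

For example, `mean_squared_error([1, 0, 1, 0], [1, 0, 0, 0])` evaluates to `0.25`, since one of the four predictions is off by 1. The two H2O functions start or reuse a local cluster, so they need the `h2o` package and Java installed before they can be called.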

Project Description

Business Objective

When a customer applies for a loan at our company, we use statistical models to determine whether or not to grant the loan based on the likelihood of the loan being repaid. The factors involved in determining this likelihood are complex, and extensive statistical analysis and modelling are required to predict the outcome for each individual case.

Aim

You must implement a model that predicts whether a loan should be granted to an individual based on the data provided.

Tech Stack

  • Language: Python
  • Libraries: Scikit-learn, H2O, pandas, NumPy, Flask, Seaborn, Matplotlib
  • Containerization: Docker

Dataset Description

The dataset is anonymized synthetic data generated specifically for this project. It is designed to exhibit characteristics similar to genuine loan data.

In this project, you must explore and cleanse a dataset of over 100,000 loan records to determine the best way to predict whether a loan applicant should be granted a loan.
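
As a rough illustration of what "explore and cleanse" can mean in practice, the sketch below counts missing values per column and fills one column with its median. It uses only the standard library, and the column names `income` and `loan_amount` are hypothetical: the actual schema of the dataset is not listed here.

```python
from statistics import median


def count_missing(records):
    """Return {column: number of records where the value is None}."""
    counts = {}
    for row in records:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


def impute_median(records, column):
    """Return copies of the records with None in `column` set to the median."""
    observed = [row[column] for row in records if row[column] is not None]
    fill = median(observed)
    return [
        {**row, column: fill if row[column] is None else row[column]}
        for row in records
    ]


# Hypothetical records; None marks a missing value.
loans = [
    {"income": 52000, "loan_amount": 10000},
    {"income": None, "loan_amount": 15000},
    {"income": 61000, "loan_amount": None},
    {"income": 45000, "loan_amount": 8000},
]

missing = count_missing(loans)
cleaned = impute_median(loans, "income")
```

Here `missing` reports one gap in each column, and the second record in `cleaned` receives the median of the three observed incomes. The original `loans` list is left unchanged, which makes it easy to compare the data before and after cleansing.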

Curriculum For This Mini Project

Business Problem and Installing Python (04m)
Libraries used in solving the problem (02m)
Exploratory Data Analysis - Part 1 (03m)
Exploratory Data Analysis (with simple imputation) - Part 2 (03m)
Exploratory Data Analysis (with simple imputation and binarization) - Part 3 (03m)
Exploratory Data Analysis (with simple imputation and categorical encoding) - Part 4 (04m)
Advanced imputation using KNN imputation (02m)
Data Scaling and Normalization (01m)
H2O setup and train/test data split (02m)
Introduction to gradient boosting (03m)
Model Building using Grid search (02m)
Introduction to XGBoost (03m)
XGBoost and deep learning implementation (02m)
Model evaluation and selecting best model (02m)
Code walkthrough (04m)
Flask Deployment (02m)
Model Containerization (03m)
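
Two of the preprocessing topics in the curriculum, KNN imputation and data scaling, can be illustrated without any libraries. The sketch below is a simplified plain-Python version written for this page, not the course's code: in practice scikit-learn's `KNNImputer` and `MinMaxScaler` would be the usual choice, and their distance handling differs in detail from the simplified distance used here.

```python
import math


def min_max_scale(rows):
    """Rescale each column to [0, 1], ignoring None when finding min and max."""
    scaled = [list(row) for row in rows]
    for j in range(len(rows[0])):
        values = [row[j] for row in rows if row[j] is not None]
        low, high = min(values), max(values)
        span = (high - low) or 1.0  # a constant column maps to 0.0
        for row in scaled:
            if row[j] is not None:
                row[j] = (row[j] - low) / span
    return scaled


def knn_impute(rows, k=2):
    """Fill each None with the column mean over the k nearest rows.

    Distance is the root mean squared difference over the columns
    that are present in both rows (a simplification).
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: every other row that has a value in column j.
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:  # leave the gap if no donor exists
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


data = [[1.0, 10.0], [2.0, 20.0], [3.0, None], [10.0, 100.0]]
imputed = knn_impute(data, k=2)
scaled = min_max_scale(data)
```

In this toy input, the missing value in the third row is filled with `15.0`, the mean of the two rows whose first column is closest to `3.0`. Scaling before imputing is often preferable on real data, so that columns with large ranges do not dominate the distance.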
