Loan Eligibility Prediction in Python using

In this loan prediction project, you will build predictive models in Python to predict whether an applicant will be able to repay a loan.


Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 102+ solved projects with iPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

What will you learn

What is H2O and why is it used
Initializing an H2O cluster to use all CPU cores and available RAM
Importing the dataset from Amazon AWS
Performing basic EDA on the dataset
Visualizing the dataset using Quantiles and Histograms
Constructing test and train sets using sampling
Defining the data for the model and displaying the results
Applying a GLM model for making predictions
Defining mean_squared_error as the evaluation metric
Viewing model information: training statistics, performance, important variables
Defining the parameters for a deep learning neural network
The first part of the data, without labels for unsupervised learning
The second part of the data, with labels for supervised learning
Converting train dataset with autoencoder model to lower-dimensional space
Training the DRF on reduced feature space
Making the final predictions
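
Two of the steps listed above, constructing train and test sets by sampling and defining mean_squared_error, can be sketched without any libraries. This is a minimal illustration only, not the project's code: the function train_test_split_by_sampling, the 80/20 split, the seed, and the toy values are all assumptions made for the example. In the project itself these steps would typically be done with H2O or scikit-learn.

```python
import random


def train_test_split_by_sampling(rows, test_fraction=0.2, seed=42):
    """Randomly sample rows into non-overlapping train and test lists.

    Hypothetical helper for illustration; the split ratio and seed are
    assumptions, not values taken from the project.
    """
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    n_test = int(round(len(rows) * test_fraction))
    test_idx = set(indices[:n_test])
    # Preserve the original row order inside each subset.
    train = [r for i, r in enumerate(rows) if i not in test_idx]
    test = [r for i, r in enumerate(rows) if i in test_idx]
    return train, test


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between labels and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data: ten record ids, and four made-up labels with predicted scores.
rows = list(range(10))
train, test = train_test_split_by_sampling(rows, test_fraction=0.2)
mse = mean_squared_error([1, 0, 1, 1], [0.9, 0.2, 0.6, 1.0])
print(len(train), len(test), round(mse, 4))
```

Fixing the seed keeps the split reproducible, which makes it easier to compare models trained on the same partition.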

Project Description

Business Objective

When a customer applies for a loan at our company, we use statistical models to determine whether or not to grant the loan based on the likelihood of the loan being repaid. The factors involved in determining this likelihood are complex, and extensive statistical analysis and modelling are required to predict the outcome for each individual case.


You must implement a model that predicts whether a loan should be granted to an individual, based on the data provided.


Tech Stack

  • Language : Python
  • Libraries : Scikit-learn, H2O, pandas, numpy, flask, Seaborn, Matplotlib
  • Containerization : Docker

Dataset Description

The dataset used is an anonymized synthetic dataset that was generated specifically for use in this project. The data is designed to exhibit similar characteristics to genuine loan data.

In this project, you must explore and cleanse a dataset of over 100,000 loan records to determine the best way to predict whether a loan applicant should be granted a loan or not.
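
As a rough idea of what cleansing can involve, the sketch below fills missing values with the median for numeric columns and the most frequent value for categorical ones. The column names (income, home) and their values are hypothetical, since the dataset's actual schema is not described here, and impute_column is an illustrative helper rather than part of the project.

```python
from statistics import median, mode


def impute_column(values):
    """Fill None entries with the median (numeric) or the mode (otherwise).

    Hypothetical helper for illustration only.
    """
    present = [v for v in values if v is not None]
    if not present:
        # Nothing to learn a fill value from; return the column unchanged.
        return list(values)
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    fill = median(present) if numeric else mode(present)
    return [fill if v is None else v for v in values]


# Made-up columns with gaps marked as None.
income = [52000, None, 61000, 47000, None]
home = ["Rent", "Own", None, "Rent", "Rent"]

imputed_income = impute_column(income)
imputed_home = impute_column(home)
print(imputed_income)
print(imputed_home)
```

The median is used instead of the mean because it is less sensitive to the extreme values that financial columns often contain.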

Curriculum For This Mini Project

Business Problem and Installing Python
Libraries used in solving the problem
Exploratory Data Analysis - Part 1
Exploratory Data Analysis (with simple imputation) - Part 2
Exploratory Data Analysis (with simple imputation and binarization) - Part 3
Exploratory Data Analysis (with simple imputation and categorical encoding) - Part 4
Advanced imputation using KNN imputation
Data Scaling and Normalization
H2O setup and train/test split
Introduction to gradient boosting
Model Building using Grid search
Introduction to XGBoost
XGBoost and deep learning implementation
Model evaluation and selecting best model
Code walkthrough
Flask Deployment
Model Containerization
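
The KNN imputation topic in the curriculum above can be illustrated with a simplified, dependency-free sketch. It fills each missing value with the mean of that feature over the k nearest complete rows, measuring distance only on the features the incomplete row actually has. The function knn_impute and the toy data are assumptions for this example; library implementations such as scikit-learn's KNNImputer handle distances and weighting differently, and features would normally be scaled first so that no single column dominates the distance.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries using the k nearest complete rows.

    Simplified illustration: neighbours are chosen by Euclidean distance
    over the features observed in the incomplete row, and each missing
    value becomes the unweighted mean of the neighbours' values.
    """
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("at least one complete row is required")
    result = []
    for row in rows:
        if None not in row:
            result.append(list(row))
            continue
        known = [i for i, v in enumerate(row) if v is not None]

        def distance(other, row=row, known=known):
            return math.sqrt(sum((row[i] - other[i]) ** 2 for i in known))

        neighbours = sorted(complete, key=distance)[:k]
        filled = [
            v if v is not None
            else sum(n[i] for n in neighbours) / len(neighbours)
            for i, v in enumerate(row)
        ]
        result.append(filled)
    return result


# Toy data: the last row is missing its second feature.
data = [
    [1.0, 10.0],
    [2.0, 20.0],
    [9.0, 90.0],
    [1.5, None],
]
filled = knn_impute(data, k=2)
print(filled[3])
```

Here the two closest complete rows to the incomplete one are the first two, so the gap is filled with the average of their second feature.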