In this data science project, you will predict borrowers' chances of defaulting on credit loans by building a credit score prediction model.


What you will learn

Understanding the problem statement
Importing the dataset
Importing the necessary libraries and understanding their use
Performing basic EDA on the dataset
Visualizing outliers using box-and-whisker plots
Fixing the outliers using the IQR method
Creating new features from existing features (feature engineering)
Preparing the dataset for fitting it into a model
Applying Logistic Regression
Understanding "Sensitivity" and "Specificity"
ROC curve and AUC as evaluation metrics
Bagging and boosting techniques
Applying Gradient Boosting and Random Forest
Making final predictions
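
Two of the steps listed above, fixing outliers with the IQR method and computing sensitivity and specificity, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the income values, and the labels are hypothetical, and capping values at the IQR fences is one common way to "fix" outliers, not necessarily the one used in the project.

```python
import numpy as np


def cap_outliers_iqr(values, k=1.5):
    """Clip values that fall outside [Q1 - k*IQR, Q3 + k*IQR] to those fences."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return np.clip(values, q1 - k * iqr, q3 + k * iqr)


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = default."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = np.sum(y_true & y_pred)      # defaulters correctly flagged
    fn = np.sum(y_true & ~y_pred)     # defaulters missed
    tn = np.sum(~y_true & ~y_pred)    # non-defaulters correctly cleared
    fp = np.sum(~y_true & y_pred)     # non-defaulters wrongly flagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical monthly incomes with one extreme value.
incomes = [30, 32, 35, 36, 38, 40, 41, 45, 400]
capped = cap_outliers_iqr(incomes)

# Hypothetical true labels and model predictions.
sens, spec = sensitivity_specificity(
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 1],
)
```

Sensitivity is the share of actual defaulters the model catches, and specificity is the share of non-defaulters it correctly clears. A lender usually has to trade one against the other when choosing a decision threshold.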

Project Description

Banks often depend on credit score prediction models to approve or deny a loan request. A bank needs a good prediction model so that it can extend as much credit as possible without exceeding its risk threshold. This data science project uses a credit score dataset with a fairly large volume of data (250K). The predictive models will be built using several approaches: random forests, gradient boosting, and logistic regression. By the end of the project, you will have built a predictive model that automatically assigns each applicant a credit score that is human-readable and easy to interpret.
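
As a rough sketch of how the three approaches named above might be compared, the snippet below fits each model with scikit-learn and scores it by AUC. It makes several assumptions: the data is synthetic (generated with `make_classification`, not the project's credit dataset), the class imbalance and hyperparameters are arbitrary, and the variable names are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a credit dataset: roughly 10% of rows are "defaults".
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=5,
    weights=[0.9, 0.1],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

auc_scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Probability of the positive class (default) for each held-out applicant.
    default_probability = model.predict_proba(X_test)[:, 1]
    auc_scores[name] = roc_auc_score(y_test, default_probability)

for name, score in auc_scores.items():
    print(f"{name}: AUC = {score:.3f}")
```

AUC is used here rather than accuracy because defaults are rare, so a model that predicts "no default" for everyone would still look accurate.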

