Predict Churn for a Telecom company using Logistic Regression

Machine Learning Project in R: Predict customer churn in the telecom sector and find the key drivers that lead to churn. Learn how a logistic regression model in R can be used to identify customer churn in a telecom dataset.


Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 50+ solved projects with IPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

What you will learn

  • Understanding the problem statement

  • Importing the dataset from AWS

  • Importing important libraries and understanding their use

  • Understanding Confusion Matrix and Statistics

  • Understanding the relationship between variables using univariate and bivariate analysis

  • Using the summary function in R and interpreting the result

  • Using box plots to find outliers and fix them

  • Using bar plots for visualization

  • Converting categorical variables into factor vectors

  • Defining evaluation metrics and understanding "Kappa"

  • Splitting the dataset into train and test sets for cross-validation

  • Applying Logistic Regression for training

  • Using tree-based models: Decision Tree and C5.0

  • Applying the boosting model GBM (Gradient Boosting Machine)

  • Selecting the best model for hyperparameter tuning

  • Using grid search and k-fold cross-validation to optimize the model and prevent overfitting

  • Plotting the results of the model for visualization

  • Making final predictions using the model and saving the result
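
As a rough illustration of several steps in the list above (factor conversion, train/test split, logistic regression, confusion matrix and Kappa), here is a minimal sketch in base R. It does not use the project's telecom dataset: the data is simulated, and the column names (tenure_months, monthly_charges, contract, churn), the 70/30 split and the 0.5 cutoff are all assumptions made for the example.

```r
# Simulated stand-in for a telecom churn table (NOT the project's dataset).
# All column names and coefficients below are hypothetical.
set.seed(42)
n <- 1000
customers <- data.frame(
  tenure_months   = sample(1:72, n, replace = TRUE),
  monthly_charges = round(runif(n, 20, 120), 2),
  contract        = sample(c("monthly", "yearly"), n, replace = TRUE),
  stringsAsFactors = FALSE
)

# Simulate the outcome: short tenure, high charges and monthly
# contracts raise the odds of churn.
log_odds <- -1 - 0.04 * customers$tenure_months +
  0.02 * customers$monthly_charges +
  1.0 * (customers$contract == "monthly")
customers$churn <- rbinom(n, 1, plogis(log_odds))

# Convert the categorical column into a factor.
customers$contract <- factor(customers$contract)

# Split into 70% train and 30% test.
train_idx <- sample(seq_len(n), size = floor(0.7 * n))
train <- customers[train_idx, ]
test  <- customers[-train_idx, ]

# Fit logistic regression on the training set.
fit <- glm(churn ~ tenure_months + monthly_charges + contract,
           data = train, family = binomial)

# Predict churn probabilities on the test set and apply a 0.5 cutoff.
prob   <- predict(fit, newdata = test, type = "response")
pred   <- factor(as.integer(prob > 0.5), levels = c(0, 1))
actual <- factor(test$churn, levels = c(0, 1))

# Confusion matrix: rows are predictions, columns are actual labels.
conf_mat <- table(Predicted = pred, Actual = actual)
print(conf_mat)

# Accuracy and Cohen's Kappa (agreement beyond what chance would give).
accuracy    <- sum(diag(conf_mat)) / sum(conf_mat)
expected    <- sum(rowSums(conf_mat) * colSums(conf_mat)) / sum(conf_mat)^2
cohen_kappa <- (accuracy - expected) / (1 - expected)
cat("Accuracy:", round(accuracy, 3), " Kappa:", round(cohen_kappa, 3), "\n")
```

Kappa is computed by hand here to show what it measures. In practice a package such as caret reports it alongside the confusion matrix.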

Project Description

Customer churn refers to a customer's decision to end the business relationship. It is also referred to as the loss of clients or customers. Customer loyalty and customer churn always add up to 100%: if a firm has a 60% loyalty rate, then its churn rate is 40%. According to the 80/20 customer profitability rule, 20% of customers generate 80% of revenue. So it is very important to predict which users are likely to churn and which factors affect their decisions. Here we show how a logistic regression model in R can be used to identify customer churn in a telecom dataset.
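
One common way to read the factors affecting churn from a logistic regression is to exponentiate the fitted coefficients into odds ratios. The sketch below is only an illustration on simulated data. The variable names (tenure_months, monthly_contract) and the effect sizes are assumptions, not results from the telecom dataset.

```r
# Simulated data (hypothetical variables, NOT the project's dataset).
set.seed(1)
n <- 2000
tenure_months    <- sample(1:72, n, replace = TRUE)
monthly_contract <- rbinom(n, 1, 0.5)  # 1 = month-to-month contract
churn <- rbinom(n, 1, plogis(-0.5 - 0.05 * tenure_months +
                               1.2 * monthly_contract))

# Fit the logistic regression model.
fit <- glm(churn ~ tenure_months + monthly_contract, family = binomial)

# exp(coefficient) is the multiplicative change in the odds of churn
# for a one-unit increase in that variable, holding the others fixed.
odds_ratios <- exp(coef(fit))
print(round(odds_ratios, 3))
```

An odds ratio above 1 points to a factor associated with higher churn odds, and a value below 1 to a factor associated with lower churn odds.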

Similar Projects

Data Science Project-TalkingData AdTracking Fraud Detection
Machine Learning Project in R: Detect fraudulent click traffic for mobile app ads using the R data science programming language.
PUBG Finish Placement Data Science Project in R
In this project, we will try to predict how often players of the video game PUBG will win when they play by themselves.
Prediction or Classification Using Ensemble Methods in R
In this data science project, you will learn to predict churn on a built-in dataset using Ensemble Methods in R.
Deep Learning with Keras in R to Predict Customer Churn
In this deep learning project, we will predict customer churn using Artificial Neural Networks and learn how to model an ANN in R with the keras deep learning package.

Curriculum For This Mini Project

  Churn 1
  Exploratory Data Analysis-Univariate-pairplot-4