Predict Churn for a Telecom company using Logistic Regression

Machine Learning Project in R: Predict customer churn in the telecom sector and find the key drivers that lead to churn. Learn how a logistic regression model in R can be used to identify customer churn in a telecom dataset.


Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 50+ solved projects with IPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

What will you learn

  • Understanding the problem statement

  • Importing the dataset from AWS

  • Importing important libraries and understanding their use

  • Understanding Confusion Matrix and Statistics

  • Understanding the relation between variables using univariate and bivariate analysis

  • Using the summary function in R and interpreting the result

  • Using box plots to find outliers and fixing them

  • Using barplot for visualization

  • Converting categorical variables into factor vectors

  • Defining evaluation metrics and understanding "Kappa"

  • Splitting the dataset into train and test sets for validation

  • Applying Logistic Regression for training

  • Using tree-based models: Decision Tree and C5.0

  • Applying the boosting model GBM (gradient boosting machine)

  • Selecting the best model for hyperparameter tuning

  • Using grid search and k-fold cross-validation to optimize the model and prevent over-fitting

  • Plotting the results of the model for visualization

  • Making final predictions using the model and saving the result
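
To make the confusion matrix and "Kappa" items above concrete, here is a minimal sketch. The course itself works in R; this illustration uses plain Python with no external libraries, and the labels, the function names `confusion_counts` and `accuracy_and_kappa`, and the ten sample predictions are all made up for the example. Kappa compares the observed agreement between actual and predicted labels with the agreement expected by chance.

```python
def confusion_counts(actual, predicted, positive="yes"):
    """Count true/false positives and negatives for a binary churn label."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1
        elif a != positive and p == positive:
            fp += 1
        elif a == positive and p != positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy_and_kappa(tp, fp, fn, tn):
    """Return (accuracy, Cohen's kappa) from the four confusion-matrix cells."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1.0:
        return observed, 0.0  # degenerate case: only one class present
    return observed, (observed - expected) / (1.0 - expected)


# Hypothetical labels: "yes" means the customer churned.
actual = ["yes", "yes", "no", "no", "no", "no", "yes", "no", "no", "no"]
predicted = ["yes", "no", "no", "no", "yes", "no", "yes", "no", "no", "no"]

tp, fp, fn, tn = confusion_counts(actual, predicted)
acc, kappa = accuracy_and_kappa(tp, fp, fn, tn)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"accuracy={acc:.2f} kappa={kappa:.2f}")  # accuracy=0.80 kappa=0.52
```

On imbalanced churn data, accuracy alone can look good even for a model that rarely predicts churn, which is why kappa is a useful companion metric.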

Project Description

Customer churn is a customer's decision to end the business relationship; it is also referred to as the loss of clients or customers. Customer loyalty and customer churn always add up to 100%: if a firm has a 60% loyalty rate, its churn rate is 40%. According to the 80/20 customer profitability rule, roughly 20% of customers generate 80% of revenue. It is therefore very important to predict which users are likely to churn and which factors affect their decisions. Here we show how a logistic regression model in R can be used to identify customer churn in the telecom dataset.
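
As a rough illustration of the idea, the sketch below fits a logistic regression by gradient descent. It is not the project's R solution: it uses plain Python for illustration only, and the two features (support calls per month, tenure in years), the six sample customers, and the function names are hypothetical. The sign of each fitted weight hints at whether a feature pushes the churn probability up or down, which is the sense in which the model surfaces "key drivers".

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(x, weights, bias):
    """Estimated probability of churn for one customer."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = predict_proba(x, weights, bias) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias


# Made-up data: [support calls per month, tenure in years]; 1 = churned.
rows = [[1, 5], [0, 6], [2, 4], [5, 1], [6, 0.5], [4, 1]]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = fit_logistic(rows, labels)
print("weights:", weights, "bias:", bias)
print("P(churn | 6 calls, 0.5 years):", predict_proba([6, 0.5], weights, bias))
print("P(churn | 0 calls, 6 years):", predict_proba([0, 6], weights, bias))
```

In this toy data the weight on support calls comes out positive and the weight on tenure negative, so more calls raise the estimated churn probability and longer tenure lowers it.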

Similar Projects

Big Data Project Predict Census Income using Deep Learning Models
In this project, we are going to work on Deep Learning using H2O to predict Census income.
Big Data Project Deep Learning with Keras in R to Predict Customer Churn
In this deep learning project, we will predict customer churn using Artificial Neural Networks and learn how to model an ANN in R with the keras deep learning package.
Big Data Project Data Science Project-Movie Review Sentiment Analysis using R
Learn to classify the sentiment of sentences from the Rotten Tomatoes dataset. You will be asked to label phrases on a scale of five values: negative, somewhat negative, neutral, somewhat positive, positive.
Big Data Project Solving Multiple Classification use cases Using H2O
In this project, we are going to talk about H2O and functionality in terms of building Machine Learning models.

Curriculum For This Mini Project

  Churn 1
  Exploratory Data Analysis-Univariate-pairplot-4