Data Science Project: TalkingData AdTracking Fraud Detection

Machine Learning Project in R: detect fraudulent click traffic for mobile app ads using the R programming language.


Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 50+ solved projects with IPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

What will you learn

Understanding the problem statement
Importing the training and testing datasets
Installing the necessary libraries and understanding their use
Performing basic EDA and checking for null values
Converting the necessary columns to timestamps
Checking for unique values and data types
Checking the relationships between variables
Visualizing distributions with ggplot
Understanding decision tree, random forest, logistic regression, SVM, boosting, bagging, CART, and neural network models
Defining the evaluation metrics
Prediction using all the features by applying Logistic Regression
Splitting the dataset into training and test sets
Balancing the data using SMOTE
Applying cross-validation to avoid overfitting
Using the "varImp" function in R to find the most important features for the model
Applying the Random Forest ensemble model
Applying a Decision Tree model with bagging
Applying the Logistic Regression linear model
Using the confusion matrix to visualize the predictions
Selecting the final model and making predictions on the test dataset
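
The evaluation steps in the list above (defining metrics, reading a confusion matrix) can be illustrated with a small sketch. The project itself is taught in R; the Python below is only an illustration under stated assumptions, not the course's code. The labels are made up, and treating 1 as the positive class is an assumption. It shows why accuracy alone can mislead when one class is rare, which is the usual situation in fraud data.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count TP, FP, TN, FN for binary labels, assuming 1 is the positive class."""
    counts = Counter(zip(actual, predicted))
    return {
        "tp": counts[(1, 1)],  # positive, predicted positive
        "fp": counts[(0, 1)],  # negative, predicted positive
        "tn": counts[(0, 0)],  # negative, predicted negative
        "fn": counts[(1, 0)],  # positive, predicted negative
    }


def summarize(cm):
    """Derive accuracy, precision, and recall from the four counts."""
    total = sum(cm.values())
    predicted_pos = cm["tp"] + cm["fp"]
    actual_pos = cm["tp"] + cm["fn"]
    return {
        "accuracy": (cm["tp"] + cm["tn"]) / total,
        "precision": cm["tp"] / predicted_pos if predicted_pos else 0.0,
        "recall": cm["tp"] / actual_pos if actual_pos else 0.0,
    }


# Made-up labels for illustration only: 8 negatives, 2 positives.
actual = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
predicted = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

cm = confusion_matrix(actual, predicted)
metrics = summarize(cm)
print(cm)
print(metrics)
```

With these made-up labels the accuracy is 0.8 while precision and recall are both 0.5, so a model can look good on accuracy and still miss half of the rare class.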

Project Description

Fraud risk is everywhere, but for companies that advertise online, click fraud can happen at an overwhelming volume, resulting in misleading click data and wasted money. Ad channels can drive up costs by simply clicking on the ad at a large scale. With over 1 billion smart mobile devices in active use every month, China is the largest mobile market in the world and therefore suffers from huge volumes of fraudulent traffic. 

In this machine learning project, you will build a model that predicts whether a click is fraudulent.
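
One signal suggested by the description is sheer volume: a source that clicks on an ad at large scale. The sketch below counts clicks per source and lists the sources at or above a threshold. It is a hedged Python illustration, not the project's R code or its actual feature set. The field names `ip` and `app`, the sample rows, and the threshold of 3 are all hypothetical.

```python
from collections import Counter


def clicks_per_ip(clicks):
    """Count how many click records each IP address produced."""
    return Counter(row["ip"] for row in clicks)


def high_volume_ips(clicks, threshold):
    """Return the IPs with at least `threshold` clicks, sorted for stable output."""
    counts = clicks_per_ip(clicks)
    return sorted(ip for ip, n in counts.items() if n >= threshold)


# Hypothetical click log; the field names are assumptions for illustration.
clicks = [
    {"ip": "10.0.0.1", "app": 3},
    {"ip": "10.0.0.1", "app": 3},
    {"ip": "10.0.0.2", "app": 7},
    {"ip": "10.0.0.1", "app": 3},
    {"ip": "10.0.0.3", "app": 3},
]

suspects = high_volume_ips(clicks, threshold=3)
print(suspects)
```

A count like this is only a candidate feature. A high click volume does not by itself prove fraud, which is why the project trains a model rather than applying a fixed rule.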

Curriculum For This Mini Project

Problem Statement
Data Set
Install Libraries
Import Data Set
Data Set Overview
Next Steps
Recap in RStudio
Missing Data
Analyse Click Time variable
Analyse Features
Convert Variables to Correct Data Types
Exploratory Data Analysis
Model Creation using all Features
Selecting Important Features
Tune Length Parameter
Model Creation using selected Features
Split Data into Training and Testing
Data Balancing using SMOTE
Model Creation - Decision Tree
Model Creation - Random Forest
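
The "Data Balancing using SMOTE" step in the curriculum above can be sketched as follows. This is a simplified Python illustration of the SMOTE idea, which creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. It is not the R package the course uses, and the function name `smote_sketch`, its parameters, and the sample points are hypothetical. The curriculum lists the train/test split before balancing, which is consistent with the common practice of oversampling only the training portion.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points from a list of minority-class tuples.

    Simplified variant: each synthetic point starts from a randomly chosen
    minority sample and moves a random fraction of the way toward one of
    that sample's k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` among the other minority samples.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # fraction of the way from base to neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class feature vectors (two numeric features each).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1)]
new_points = smote_sketch(minority, n_new=4)
print(new_points)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies instead of simply duplicating existing rows.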