Data Science Project: TalkingData AdTracking Fraud Detection

Machine Learning Project in R: Detect fraudulent click traffic for mobile app ads using the R programming language.


Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 50+ solved projects with IPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

What you will learn

Understanding the problem statement
Importing the training and testing datasets
Installing the necessary libraries and understanding their use
Performing basic EDA and checking for null values
Parsing the necessary columns as timestamps
Checking for unique values and data types
Checking for the relationship between different variables
Visualizing distributions with ggplot
Understanding decision tree, random forest, logistic regression, SVM, boosting, bagging, CART, and neural network models
Defining the evaluation metrics
Prediction using all the features by applying Logistic Regression
Dividing the dataset into train and test datasets
Balancing the data using SMOTE
Applying cross-validation to guard against overfitting
Using the "varImp" function in R to find the most important features for the model
Applying an ensemble method: the Random Forest model
Applying a bagged Decision Tree model
Applying a linear model: Logistic Regression
Using the confusion matrix to visualize the predictions
Selecting the final model and making predictions on the test dataset
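
The evaluation steps in the list above (defining metrics and reading a confusion matrix) can be sketched in a few lines. This is only an illustration, written in plain Python rather than the R used in the project videos; the function names and the toy labels are hypothetical and do not come from the course material. It shows why accuracy alone can mislead when one class is rare.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


def precision_recall_f1(counts):
    """Derive precision, recall and F1 from the confusion counts."""
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy labels, made up for illustration: 1 = rare positive class, 0 = majority.
actual = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
predicted = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]

counts = confusion_counts(actual, predicted)
accuracy = (counts["tp"] + counts["tn"]) / len(actual)
precision, recall, f1 = precision_recall_f1(counts)

# A model that always predicts the majority class still scores 70% accuracy
# on these labels while finding none of the positives.
baseline = confusion_counts(actual, [0] * len(actual))
baseline_accuracy = (baseline["tp"] + baseline["tn"]) / len(actual)
baseline_recall = precision_recall_f1(baseline)[1]

print(counts, accuracy, precision, recall, f1)
print(baseline_accuracy, baseline_recall)
```

On heavily imbalanced click data, precision and recall on the rare class are usually more informative than raw accuracy, which is one reason to define the metric before comparing models.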

Project Description

Fraud risk is everywhere, but for companies that advertise online, click fraud can happen at an overwhelming volume, resulting in misleading click data and wasted money. Ad channels can drive up costs simply by clicking on an ad at large scale. With over 1 billion smart mobile devices in active use every month, China is the largest mobile market in the world and therefore suffers from huge volumes of fraudulent traffic.

In this machine learning project, you will build a model to determine whether a click is fraudulent.

Curriculum For This Mini Project

Problem Statement
Data Set
Install Libraries
Import Data Set
Data Set Overview
Next Steps
Recap in RStudio
Missing Data
Analyse Click Time variable
Analyse Features
Convert variables to correct Data types
Exploratory Data Analysis
Model Creation using all Features
Selecting Important Features
Tune length Parameter
Model Creation using selected Features
Split Data into Training and Testing
Data Balancing using SMOTE
Model Creation - Decision Tree
Model Creation - Random Forest
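
The "Split Data into Training and Testing" and "Data Balancing using SMOTE" steps in the curriculum above can be illustrated with a small sketch. It is written in plain Python with made-up two-feature data, not the R code from the videos, and the helper names (stratified_split, smote) are hypothetical. The smote function implements only the core SMOTE idea, interpolating between a minority point and one of its nearest minority neighbours; it is not a substitute for a library implementation. The sketch assumes balancing is applied to the training portion only, so the test set keeps its original class mix.

```python
import random


def stratified_split(labels, test_fraction=0.3, seed=42):
    """Split row indices so each class keeps roughly the same share
    in the training and testing portions."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    return train_idx, test_idx


def smote(minority, n_new, k=3, seed=42):
    """Create n_new synthetic points, each on the line segment between a
    minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


# Made-up data: 40 majority rows and 10 minority rows with two features each.
data_rng = random.Random(0)
rows = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(40)]
rows += [(data_rng.gauss(3, 0.5), data_rng.gauss(3, 0.5)) for _ in range(10)]
labels = [0] * 40 + [1] * 10

train_idx, test_idx = stratified_split(labels)
train_minority = [rows[i] for i in train_idx if labels[i] == 1]
train_majority = [rows[i] for i in train_idx if labels[i] == 0]

# Oversample the minority class until both classes are the same size.
synthetic = smote(train_minority, len(train_majority) - len(train_minority))
balanced_rows = train_majority + train_minority + synthetic
balanced_labels = ([0] * len(train_majority)
                   + [1] * (len(train_minority) + len(synthetic)))

print(len(train_idx), len(test_idx), len(synthetic))
print(balanced_labels.count(0), balanced_labels.count(1))
```

Splitting before balancing keeps synthetic rows out of the test set, so the evaluation still reflects the real class imbalance.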