Credit Card Fraud Detection as a Classification Problem

In this data science project, we will predict credit card fraud in a transactional dataset using several predictive models.


What will you learn

Exploring the dataset
Perform EDA using Univariate, Bivariate and Multivariate analysis
Visualizing and understanding the feature plots and correlation plots
Create pairwise plots for each attribute
Create density plots for each attribute
Learn to handle imbalanced data using oversampling, undersampling and mixed sampling
Learn to remove redundant features
Rank features using LVQ model (Learning Vector Quantization)
Select features using RFE method (Recursive Feature Elimination)
Learn preprocessing using LDA (Linear Discriminant Analysis)
Apply Linear Algorithms like Logistic Regression model
Apply Non Linear Algorithms like SVM (Support Vector Machine), KNN (K Nearest Neighbour) and Naive Bayes
Apply Non Linear Algorithms like CART (Classification and Regression Trees)
Apply Ensemble Algorithms like RandomForest, Bagging CART, Gradient Boosting model
Perform GLMNet Regression analysis
Apply Neural Network model
Compare results of different models
Select the best model
Visualize results using box and whisker plots
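
The oversampling and undersampling steps listed above can be sketched in a few lines. This is only an illustration using the Python standard library, not the project's own code: the function names `random_oversample` and `random_undersample`, the toy data, and the assumption of a binary label with a smaller minority class are all hypothetical. Mixed sampling, which combines the two, is not shown.

```python
import random
from collections import Counter


def random_oversample(rows, labels, minority_label, seed=0):
    """Duplicate randomly chosen minority rows until both classes are the same size."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    # Draw minority indices with replacement to close the gap.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    keep = majority + minority + extra
    rng.shuffle(keep)
    return [rows[i] for i in keep], [labels[i] for i in keep]


def random_undersample(rows, labels, minority_label, seed=0):
    """Keep every minority row plus an equally sized random subset of majority rows."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    # Sample majority indices without replacement.
    keep = minority + rng.sample(majority, len(minority))
    rng.shuffle(keep)
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy data (hypothetical): 1,000 transactions, 5 of them labelled as fraud (1).
rows = [[float(i)] for i in range(1000)]
labels = [1 if i < 5 else 0 for i in range(1000)]

over_rows, over_labels = random_oversample(rows, labels, minority_label=1)
under_rows, under_labels = random_undersample(rows, labels, minority_label=1)

print(Counter(over_labels))
print(Counter(under_labels))
```

Resampling is normally applied to the training portion only, so that the evaluation data keeps the original class ratio.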

Project Description

The Credit Card Fraud Detection dataset contains transactions made with credit cards in September 2013 by European cardholders. It covers transactions that occurred over two days, with 492 frauds out of 284,807 transactions. The dataset is highly imbalanced: the positive class (frauds) accounts for 0.172% of all transactions.
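
To see why this imbalance matters, consider the arithmetic for a hypothetical baseline that labels every transaction as legitimate. The sketch below uses only the two counts quoted above; the baseline itself is an assumption added for illustration and is not one of the project's models.

```python
# Counts quoted in the project description.
n_transactions = 284_807
n_frauds = 492

fraud_rate = n_frauds / n_transactions

# Hypothetical baseline: predict "legitimate" for every transaction.
true_negatives = n_transactions - n_frauds
false_negatives = n_frauds
true_positives = 0

baseline_accuracy = true_negatives / n_transactions
baseline_recall = true_positives / (true_positives + false_negatives)

print(f"fraud rate:        {fraud_rate:.4%}")
print(f"baseline accuracy: {baseline_accuracy:.4%}")
print(f"baseline recall:   {baseline_recall:.0%}")
```

The baseline is right on more than 99.8% of transactions while catching no fraud at all, which is why accuracy alone is a poor yardstick here and metrics such as recall and precision on the fraud class are more informative.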

The dataset was collected and analyzed during a research collaboration between Worldline and the Machine Learning Group of ULB (Université Libre de Bruxelles) on big data mining and fraud detection. More details on current and past projects on related topics are available from the Machine Learning Group.

Curriculum For This Mini Project

Loading the dataset
Understanding the Data
Exploratory Data Analysis (EDA)
Cross Validation
Business Aspect
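
For the cross-validation step, a stratified split is a common choice with imbalanced data, because it keeps the fraud ratio roughly equal across folds. The following is a minimal standard-library sketch under stated assumptions: the helper name `stratified_k_fold` and the toy labels are hypothetical, and the curriculum does not say which splitting scheme the project uses. In practice a library routine such as scikit-learn's `StratifiedKFold` would usually be preferred.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs with class ratios roughly preserved."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's indices round-robin so every fold gets a fair share.
        for position, index in enumerate(indices):
            folds[position % k].append(index)

    for test_fold in range(k):
        test = sorted(folds[test_fold])
        train = sorted(
            i for f, fold in enumerate(folds) if f != test_fold for i in fold
        )
        yield train, test


# Toy labels (hypothetical): 10 frauds among 1,000 transactions.
labels = [1] * 10 + [0] * 990
splits = list(stratified_k_fold(labels, k=5))

for train, test in splits:
    frauds_in_test = sum(labels[i] for i in test)
    print(len(train), len(test), frauds_in_test)
```

With plain random splitting, a fold could easily end up with no fraud cases at all, which would make fraud-class metrics for that fold undefined.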