What is anomaly detection?
Anomaly detection (aka outlier analysis) is a step in data mining that identifies data points, events, and/or observations that deviate from a dataset’s normal behavior. Anomalous data can indicate critical incidents, such as a technical glitch, or potential opportunities, for instance, a change in consumer behavior.
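As a toy illustration of the idea (not part of this project's code), points that deviate strongly from the rest of a sample can be flagged with a simple z-score rule in base R; the 3-standard-deviation cutoff is an arbitrary assumption:

```r
set.seed(42)
# Toy data: 100 points from a standard normal distribution, plus one obvious outlier
x <- c(rnorm(100), 15)

# z-score: distance from the mean in units of standard deviation
z <- abs((x - mean(x)) / sd(x))

# Flag points more than 3 standard deviations from the mean as anomalies
which(z > 3)  # typically flags index 101, the injected outlier
```

Real anomaly detection methods, such as the autoencoder used later in this project, generalize this idea from one variable to many correlated variables at once.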
Applications of Anomaly Detection
Banking, Financial Services, and Insurance (BFSI) – In the banking sector, anomaly detection is used to flag abnormally large transactions, fraudulent activity, and phishing attacks.
Retail – In retail, anomaly detection is used to process large volumes of financial transactions and identify fraudulent behavior, such as identity theft and fraudulent credit card usage.
Manufacturing – In manufacturing, anomaly detection can be used in several important ways, such as identifying underperforming machines and tools, which can otherwise take months to find.
IT and Telecom – In IT and telecommunications, anomaly detection is increasingly valuable for detecting and acting on personal threats to users, financial threats to service providers, and other unexpected threats.
Defense and Government – In the defense and government setting, anomaly detection is best used to identify excessive and fraudulent spending in government budgets and audits, which can save governments an immense amount of money.
Healthcare – In healthcare, anomaly detection supports a crucial management task: identifying fraudulent claims, both from hospitals and from insurance providers, which improves the quality of health services and avoids the loss of large amounts of money.
Language used: R
Machine Learning interface: H2O
Other packages used: caret, e1071, ROCR, and many more
In this project, we will use a credit card fraud dataset that contains fraudulent and legitimate transactions over a certain period. The data is available in .csv format. Most of the columns (V1 to V28) are not named explicitly because a PCA (Principal Component Analysis) transformation was applied to the original dataset to preserve the confidentiality of the data. Apart from these variables, we have a few explicit variables, as follows:
Time - Difference in seconds between each transaction and its previous transaction
Amount - Transaction Amount
Class - Target label for each transaction:
0 - Non-fraudulent transaction
1 - Fraudulent transaction
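Before modelling, the dataset can be loaded and the class imbalance inspected. A minimal sketch in base R; the file name creditcard.csv and its location in the working directory are assumptions:

```r
# Load the credit card transactions (file name is an assumption)
df <- read.csv("creditcard.csv")

# Inspect the structure: PCA-transformed features V1..V28,
# plus the explicit Time, Amount, and Class columns
str(df)

# Fraud datasets are highly imbalanced, so check the class
# distribution before choosing a modelling and evaluation strategy
table(df$Class)
prop.table(table(df$Class))
```

The strong imbalance between the two classes is why plain accuracy is a poor metric here, and why the agenda below evaluates models with reconstruction MSE and tuned thresholds instead.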
Business context and objective
Translating into Data Science approach
What, why, where Anomaly Detection?
Why we are using a fraud dataset for this problem
Algorithms used to solve this problem
Data importing and Data Understanding
Creating time variable
Preparing data for modelling
Understanding neural networks and deep neural networks
Unsupervised Learning using h2o
Building the model and model details
Understanding the evaluation parameters
Evaluating based on reconstruction MSE
Supervised Learning using h2o
Building and tuning supervised learning model using H2O
Supervised Learning using Pretrained model and evaluation
Trying different thresholds to improve accuracy
Making production-ready code
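The unsupervised part of the agenda above can be sketched with H2O's autoencoder interface in R. This is an illustrative sketch, not the project's exact code: the hidden-layer sizes, epoch count, and the 99th-percentile threshold are assumptions, and df is assumed to be the fraud data frame loaded earlier:

```r
library(h2o)
h2o.init()

# Convert the R data frame to an H2OFrame (df is assumed to hold the fraud data)
hf <- as.h2o(df)
features <- setdiff(colnames(hf), "Class")

# Train an autoencoder on the features only (unsupervised):
# the network learns to reconstruct typical, non-fraudulent transactions
ae <- h2o.deeplearning(
  x              = features,
  training_frame = hf,
  autoencoder    = TRUE,
  hidden         = c(16, 4, 16),  # bottleneck architecture (assumed sizes)
  activation     = "Tanh",
  epochs         = 20
)

# Per-row reconstruction MSE: transactions the model reconstructs poorly
# are candidates for anomalies
mse <- as.data.frame(h2o.anomaly(ae, hf))

# Flag transactions above an assumed threshold (here, the 99th percentile)
threshold <- quantile(mse$Reconstruction.MSE, 0.99)
flagged   <- mse$Reconstruction.MSE > threshold
```

In practice, the threshold is tuned by comparing the flagged rows against the known Class labels, which is exactly the threshold-exploration step listed in the agenda above.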