Each project comes with 2-5 hours of micro-videos explaining the solution.
Get access to 50+ solved projects with IPython notebooks and datasets.
Add project experience to your LinkedIn/GitHub profiles.
Initially, I was unaware of how this would cater to my career needs. But then I came across the reviews given on the website. I went through many of them and found them all positive. I would...
I have had a very positive experience. The platform is very rich in resources, and the expert was thoroughly knowledgeable on the subject matter - real world hands-on experience. I wish I had this...
In our previous Spark project, Real-Time Log Processing using Spark Streaming Architecture, we built on an earlier log-processing topic by using the speed layer of the lambda architecture. We performed real-time processing of application log entries using Spark Streaming and stored the final data in an HBase table.
In this Kafka project, we will pursue the same objectives using another set of real-time technologies. The idea is to compare the two approaches to real-time data processing, which will soon become mainstream in various industries.
We will be using Kafka for the streaming architecture in a microservice sense.
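As a rough illustration of what a Kafka-only log-processing microservice might do with each record, here is a minimal sketch in Python. It is not the project's actual code. It assumes the logs are in Apache Common Log Format (the project description does not specify a format), and it uses plain in-memory lists in place of the input and output topics so that it runs without a broker. The names `parse_log_line` and `process_stream` are hypothetical.

```python
import json
import re
from collections import Counter

# Assumption: log lines follow Apache Common Log Format.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A size of "-" means no body was returned; treat it as 0 bytes.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


def process_stream(raw_messages):
    """Parse raw lines, emit JSON records, and count HTTP status codes.

    raw_messages stands in for records read from an input topic; the
    returned list stands in for records written to an output topic.
    """
    output = []
    status_counts = Counter()
    for line in raw_messages:
        record = parse_log_line(line)
        if record is None:
            continue  # skip malformed lines instead of failing the stream
        status_counts[record["status"]] += 1
        output.append(json.dumps(record, sort_keys=True))
    return output, status_counts


sample = [
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    'not a log line',
    '10.0.0.5 - - [10/Oct/2000:13:55:40 -0700] "GET /missing HTTP/1.0" 404 -',
]
parsed, counts = process_stream(sample)
```

In a real deployment, the list passed to `process_stream` would be replaced by a loop over a Kafka consumer, and each JSON string would be sent to an output topic by a producer. The parsing and counting logic would stay the same.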
The major highlight of this big data project is that students will compare the Spark Streaming approach with the Kafka-only approach. This is a great session for developers and analysts as much as architects.
Note: The Cloudera QuickStart VM does not include Kafka. We intend to work around that, so come prepared to install Kafka on the Cloudera QuickStart VM.
Spark Project - Discuss real-time monitoring of taxis in a city. The real-time data streaming will be simulated using Flume. The ingestion will be done using Spark Streaming.
In this project, we will show how to build an ETL pipeline on streaming datasets using Kafka.
Use the dataset on aviation for analytics to simulate a complex real-world big data pipeline based on messaging with AWS QuickSight, Druid, NiFi, Kafka, and Hive.