Each project comes with 2-5 hours of micro-videos explaining the solution.
Get access to 50+ solved projects with IPython notebooks and datasets.
Add project experience to your LinkedIn/GitHub profiles.
Data is everywhere and is constantly being generated around us. Big data tools make it possible to ingest and process that data, and to make decisions based on it, at high speed.
This big data project for beginners demonstrates how to use Apache Flume to ingest trading data from a source. While the default data flow is to archive all data to HDFS, Flume is also configured to channel some preconfigured symbols or trading pairs of interest to another processing server using Kafka. All the processed instructions are stored in a relational database (MySQL).
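The routing rule described above (archive everything, forward only selected symbols) can be sketched in a few lines of plain Python. This is an illustration of the logic only, not the project's Flume configuration: the `Sinks` class, the `route` function, the `symbol` field name, and the symbols in `SYMBOLS_OF_INTEREST` are all hypothetical, and two in-memory lists stand in for the HDFS and Kafka sinks. In Flume itself, this kind of fan-out would normally be expressed through channel selectors in the agent's properties file rather than in code.

```python
from dataclasses import dataclass, field

# Hypothetical watch list; the project description does not name the symbols.
SYMBOLS_OF_INTEREST = {"BTC-USD", "ETH-USD"}


@dataclass
class Sinks:
    """In-memory stand-ins for the two destinations described in the text."""
    hdfs_archive: list = field(default_factory=list)  # default flow: everything
    kafka_topic: list = field(default_factory=list)   # only symbols of interest


def route(event: dict, sinks: Sinks, watched=SYMBOLS_OF_INTEREST) -> None:
    """Archive every event; also forward it if its symbol is on the watch list."""
    sinks.hdfs_archive.append(event)
    if event.get("symbol") in watched:
        sinks.kafka_topic.append(event)


# Made-up sample events, assuming each trade record carries a "symbol" field.
sample_events = [
    {"symbol": "BTC-USD", "price": 100.0},
    {"symbol": "XRP-USD", "price": 0.5},
    {"symbol": "LTC-USD", "price": 70.0},
]

sinks = Sinks()
for ev in sample_events:
    route(ev, sinks)

print(len(sinks.hdfs_archive), "events archived")
print([e["symbol"] for e in sinks.kafka_topic], "forwarded for processing")
```

Keeping the archive step unconditional mirrors the "default data flow" in the description: the forwarding decision never affects what is stored in the archive.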
We will use the following tools in this Flume and Kafka project:
In this project, we will show how to build an ETL pipeline on streaming datasets using Kafka.
In this big data project, we will see how data ingestion and loading are done with the Kafka Connect APIs, while transformation is done with the Kafka Streams API.
Spark Project - Discusses real-time monitoring of taxis in a city. The real-time data stream is simulated using Flume, and ingestion is done using Spark Streaming.