Each project comes with 2-5 hours of micro-videos explaining the solution.
Get access to 50+ solved projects with IPython notebooks and datasets.
Add project experience to your LinkedIn/GitHub profiles.
Data is everywhere and constantly being generated around us. With big data tools, it is possible to ingest and process this data, and make decisions based on it, at high speed.
This big data project for beginners demonstrates how to use Apache Flume to ingest trading data from a source. The default data flow archives all data to HDFS, but Flume is also configured to route a preconfigured set of symbols or trading pairs of interest to another processing server through Kafka. All the processed instructions are stored in a relational database (MySQL).
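To make the downstream step concrete, here is a minimal sketch of a consumer on the processing server that picks out trading symbols of interest from a Kafka topic and stores them in MySQL. The topic name, symbol list, table schema, and connection settings are illustrative assumptions, not the project's actual configuration.

```python
# Minimal sketch: consume trade events for selected symbols from Kafka
# and store them in MySQL. Topic name, table schema, and credentials
# are illustrative assumptions, not the project's actual settings.
import json

from kafka import KafkaConsumer          # pip install kafka-python
import mysql.connector                   # pip install mysql-connector-python

SYMBOLS_OF_INTEREST = {"BTCUSD", "ETHUSD"}   # hypothetical trading pairs

consumer = KafkaConsumer(
    "trades",                                # hypothetical topic fed by Flume
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)

db = mysql.connector.connect(
    host="localhost", user="trader", password="secret", database="trading"
)
cursor = db.cursor()

for message in consumer:
    trade = message.value
    if trade.get("symbol") in SYMBOLS_OF_INTEREST:
        # Assumed table: trades(symbol VARCHAR, price DOUBLE, ts BIGINT)
        cursor.execute(
            "INSERT INTO trades (symbol, price, ts) VALUES (%s, %s, %s)",
            (trade["symbol"], trade["price"], trade["ts"]),
        )
        db.commit()
```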
We will use the following tools in this Flume Kafka project:
The goal of this Apache Kafka project is to process application log entries in real time, using Kafka as the streaming layer of a microservices-style architecture.
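As one way a microservice could ship its logs into Kafka, here is a minimal sketch of a logging handler that publishes each log record to a topic. The topic name, broker address, and service name are assumptions for illustration only.

```python
# Minimal sketch: forward application log records to a Kafka topic so a
# separate service can consume them in real time. The topic name and
# broker address are illustrative assumptions.
import json
import logging

from kafka import KafkaProducer          # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

class KafkaLogHandler(logging.Handler):
    """Publish each log record as a JSON message on the 'app-logs' topic."""

    def emit(self, record: logging.LogRecord) -> None:
        producer.send("app-logs", {
            "logger": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
            "created": record.created,
        })

logger = logging.getLogger("orders-service")   # hypothetical microservice name
logger.addHandler(KafkaLogHandler())
logger.setLevel(logging.INFO)
logger.info("order accepted")                  # arrives on the 'app-logs' topic
```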
In this project, we are going to analyze a streaming log file dataset by integrating Kafka and Kylin.
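On the ingestion side, Kylin's streaming cubes can read flat JSON messages from a Kafka topic, so the raw log lines first need to be parsed into structured events. Below is a minimal sketch of that step; the file path, topic name, and log format are illustrative assumptions, and the streaming cube itself would be defined separately in the Kylin UI.

```python
# Minimal sketch of the ingestion side: parse raw access-log lines into flat
# JSON events and publish them to a Kafka topic for a Kylin streaming cube
# to consume. File path, topic name, and log format are assumptions.
import json
import re

from kafka import KafkaProducer          # pip install kafka-python

# Common access-log layout: host ident user [time] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+)'
)

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

with open("access.log") as logfile:          # hypothetical input file
    for line in logfile:
        match = LOG_PATTERN.match(line)
        if match:
            producer.send("weblogs", match.groupdict())  # hypothetical topic

producer.flush()
```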
In this project, we will show how to build an ETL pipeline on streaming datasets using Kafka.
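The core pattern in such a pipeline is extract-transform-load across topics: read raw events from one Kafka topic, clean them, and write the results to another. Here is a minimal sketch of that loop; the topic names, fields, and transformation are assumptions for illustration, not the project's actual pipeline.

```python
# Minimal sketch of a streaming ETL step with Kafka: extract raw events from
# one topic, transform them, and load the cleaned records into another topic.
# Topic names, fields, and the transformation are illustrative assumptions.
import json

from kafka import KafkaConsumer, KafkaProducer   # pip install kafka-python

consumer = KafkaConsumer(
    "raw-events",                                 # hypothetical source topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def transform(event: dict) -> dict:
    """Example transformation: normalise field names and drop empty values."""
    return {key.lower(): value for key, value in event.items() if value not in (None, "")}

for message in consumer:
    producer.send("clean-events", transform(message.value))  # hypothetical sink topic
```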