Build a big data pipeline with AWS QuickSight, Druid, and Hive

Use an aviation dataset to simulate a complex, real-world, messaging-based big data pipeline with AWS QuickSight, Druid, NiFi, Kafka, and Hive.
What you will learn

End-to-end implementation of a big data pipeline on AWS
Scalable, reliable, and secure data architecture of the kind used by leading big data teams
Detailed explanation of the V's of big data, and of building and automating data pipelines
Real-time streaming data import from an external API using NiFi
Build both batch and streaming data pipelines on AWS with NiFi
Write the data into HDFS (batch) and Kafka (streaming ingestion) using NiFi
Ingest the data into Druid from HDFS (batch ingestion) as well as Kafka (real-time ingestion)
Compare the performance of Druid and Hive
Discuss limitations and opportunities of Druid and Hive
Hive external table creation on top of HDFS data
Performing ETL operations widely used in industry on Hive data and storing the results in a managed table
Visualising Hive data using AWS QuickSight to calculate some of the KPIs for the aviation data
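
As a rough illustration of the Hive items in the list above, the sketch below generates the HiveQL for an external table over files already sitting in HDFS, and a CREATE TABLE AS SELECT statement that stores an aggregated KPI in a managed table. The table names, column names, HDFS path, and the average-delay KPI are all hypothetical; the actual aviation schema used in the project is not given here, so treat this as one possible shape rather than the project's own code.

```python
def external_table_ddl(table, columns, hdfs_location, delimiter=","):
    """Build HiveQL for an external table over delimited text files in HDFS.

    `columns` is a list of (name, hive_type) pairs. Dropping an external
    table removes only the metadata; the files under LOCATION stay in place.
    """
    cols = ",\n  ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{hdfs_location}'"
    )


def managed_table_etl(source_table, target_table):
    """Build a CTAS statement that writes an aggregate into a managed table.

    The KPI (average departure delay per airline) and the column names
    `airline` and `dep_delay` are assumptions made for this example.
    """
    return (
        f"CREATE TABLE {target_table}\n"
        f"STORED AS ORC AS\n"
        f"SELECT airline, AVG(dep_delay) AS avg_dep_delay\n"
        f"FROM {source_table}\n"
        f"GROUP BY airline"
    )


if __name__ == "__main__":
    # Hypothetical schema and path, for illustration only.
    schema = [
        ("flight_date", "STRING"),
        ("airline", "STRING"),
        ("origin", "STRING"),
        ("destination", "STRING"),
        ("dep_delay", "DOUBLE"),
    ]
    print(external_table_ddl("flights_raw", schema, "/data/aviation/flights"))
    print()
    print(managed_table_etl("flights_raw", "airline_delay_kpi"))
```

The generated statements could be run from the Hive CLI or Beeline. Keeping the raw data in an external table and the derived results in a managed table means Hive owns only the data it produced itself.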

Project Description

In this big data project, a senior big data architect demonstrates how to implement a big data pipeline on AWS at scale. You will use an aviation dataset and analyse it with a widely used big data stack (NiFi, Kafka, HDFS, Hive, Druid, and AWS QuickSight) to derive metrics from the existing data. The pipelines are built on AWS to serve both batch and real-time streaming ingestion of the data for various consumers according to their needs. The design is highly scalable and reflects how such a pipeline is set up in a very large organisation.

Curriculum For This Mini Project

Introduction to building a pipeline using Druid, Hive, and QuickSight
Introduction to Big Data
Introduction to Big Data Pipeline
System Requirements
Data Architecture using NiFi, Kafka, Hive, and Druid
Introduction to Apache NiFi
Apache Kafka vs Apache Flume
Apache Hive optimization techniques
Druid architecture and comparison with Hive and Presto
Exploration of Dataset
Extracting Data using NiFi into HDFS, Kafka, and MySQL
Configuring HDFS and Druid
Ingesting data from HDFS into Druid
Writing data from NiFi into Kafka
Consuming data from Kafka into Druid
Comparing query performance in Hive, Druid, and MySQL
Comparing query performance using MySQL
Connecting MySQL to AWS QuickSight for Visualization
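
To give a feel for the Kafka-to-Druid step in the curriculum above, here is a minimal sketch of a Druid Kafka supervisor spec built as a Python dictionary. The datasource name, topic, broker address, timestamp column, and dimensions are placeholders invented for this example, and the nesting under a top-level "spec" key follows recent Druid releases, so older versions may expect a slightly different layout. In a typical setup the resulting JSON is sent with an HTTP POST to the `/druid/indexer/v1/supervisor` endpoint; the sketch only builds and prints the spec.

```python
import json


def kafka_supervisor_spec(datasource, topic, bootstrap_servers,
                          timestamp_column, dimensions):
    """Return a Kafka supervisor spec for streaming JSON events into Druid.

    All argument values are supplied by the caller; nothing here is taken
    from the project's actual configuration.
    """
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                # "auto" lets Druid detect ISO 8601 or epoch-millis timestamps.
                "timestampSpec": {"column": timestamp_column, "format": "auto"},
                "dimensionsSpec": {"dimensions": list(dimensions)},
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "DAY",
                    "queryGranularity": "NONE",
                    "rollup": False,
                },
            },
            "ioConfig": {
                "type": "kafka",
                "topic": topic,
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
                # Start from the oldest available offset on first run.
                "useEarliestOffset": True,
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    spec = kafka_supervisor_spec(
        datasource="flights_stream",
        topic="aviation-events",
        bootstrap_servers="localhost:9092",
        timestamp_column="event_time",
        dimensions=["airline", "origin", "destination"],
    )
    print(json.dumps(spec, indent=2))
```

Batch ingestion from HDFS uses a different task type with its own ioConfig, but the dataSchema section (timestamp, dimensions, granularity) has the same shape, which makes it easier to keep the batch and streaming datasources comparable.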

