Each project comes with 2-5 hours of micro-videos explaining the solution.
Get access to 50+ solved projects with IPython notebooks and datasets.
Add project experience to your LinkedIn/GitHub profiles.
In this Big Data project, a senior Big Data Architect will demonstrate how to implement a Big Data pipeline on AWS at scale. You will use the Covid-19 dataset, which is streamed in real time from an external API using NiFi. NiFi parses the complex JSON data into CSV format and stores the result in HDFS.
This data is then sent to Kafka and processed using PySpark. The processed data is consumed from Spark and stored in HDFS, and a Hive external table is created on top of HDFS. Finally, the cleaned, transformed data is stored in the data lake and deployed. Visualisation is then done using Tableau and AWS QuickSight.
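The project performs the JSON-to-CSV step inside NiFi. As a rough illustration of what that step does, here is a minimal plain-Python sketch that flattens nested JSON records into CSV rows. It is not the project's NiFi flow, and the field names (`country`, `stats`, `confirmed`, `deaths`) are hypothetical, since the API's actual schema is not given here.

```python
import csv
import io
import json


def flatten(record, parent_key="", sep="_"):
    """Flatten nested dicts into one level, joining key paths with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def json_to_csv(json_text):
    """Convert a JSON array of (possibly nested) objects to CSV text."""
    rows = [flatten(item) for item in json.loads(json_text)]
    # Union of all keys in first-seen order, so records with missing
    # fields still fit the header (missing values become empty cells).
    fieldnames = list(dict.fromkeys(k for row in rows for k in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample payload; the real API's field names are an assumption.
sample = json.dumps([
    {"country": "A", "stats": {"confirmed": 10, "deaths": 1}},
    {"country": "B", "stats": {"confirmed": 5}},
])
csv_text = json_to_csv(sample)
print(csv_text)
```

Flattening key paths into column names is one common choice. A real feed with nested arrays would need an extra rule, such as one output row per array element.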
In this big data project, we will look at how to mine and make sense of connections in a simple way by building a Spark GraphX algorithm and a network crawler.
In this Hive project, you will design a data warehouse for e-commerce environments.
In this project, we will show how to build an ETL pipeline on streaming datasets using Kafka.