The goal of this Spark project is to analyze business reviews from the Yelp dataset and ingest the final output of data processing into Elasticsearch. You will also use Kibana, the visualisation tool in the ELK stack, to build various kinds of ad-hoc reports from the data.
In this big data project, we will embark on real-time data collection and aggregation from a simulated real-time system using Spark Streaming.
This project continues the previous Hive project, "Tough engineering choices with large datasets in Hive Part - 1", and works on processing big data sets using Hive.
In this Databricks Azure project, you will use Spark and the Parquet file format to analyse the Yelp reviews dataset. As part of this, you will deploy Azure Data Factory and data pipelines, and visualise the analysis.
In this big data Spark project, we will do Twitter sentiment analysis on incoming data using Spark Streaming.
Spark Project - Discuss real-time monitoring of taxis in a city. The real-time data streaming will be simulated using Flume. The ingestion will be done using Spark Streaming.
The goal of this Apache Kafka project is to process log entries from applications in real time, using Kafka as the streaming layer in a microservice-style architecture.
Learn to design Hadoop Architecture and understand how to store data using data acquisition tools in Hadoop.
Hadoop Project - Perform basic big data analysis on an airline dataset using the big data tools Pig, Hive and Impala.
In this Hive project, you will design a data warehouse for e-commerce environments.