In this PySpark project, you will simulate a complex real-world data pipeline based on messaging. This project is deployed using the following tech stack - NiFi, PySpark, Hive, HDFS, Kafka, Airflow, Tableau and AWS QuickSight.
In this R data science project, we will explore wine dataset to assess red wine quality. The objective of this data science project is to explore which chemical properties will influence the quality of red wines.
Use the Hadoop ecosystem to glean valuable insights from the Yelp dataset. You will be analyzing the different patterns that can be found in the Yelp data set, to come up with various approaches in solving a business problem.