Each project comes with 2-5 hours of micro-videos explaining the solution.
Get access to 50+ solved projects with iPython notebooks and datasets.
Add project experience to your Linkedin/Github profiles.
My Interaction was very short but left a positive impression. I enrolled and asked for a refund since I could not find the time. What happened next: They initiated Refund immediately. Their... Read More
I have worked for more than 15 years in Java and J2EE and have recently developed an interest in Big Data technologies and Machine learning due to a big need at my workspace. I was referred here by a... Read More
Spark 2 offers a huge but yet backward-compatible break from the Spark 1.x, not only in terms of high-level API but also in performance. And spark the module with the most significant new features is Spark SQL.
In this apache spark project, we will explore a number of this features in practice.
We will discuss using various dataset, the new unified spark API as well as the optimization features that makes Spark SQL the first way to explore in processing structured data.
However, there are times when it is inevitable to resort to Spark Core - RDD in Spark 2. We will explore that as well alongside the newest and cool structured streaming API that enables fault-tolerant stream processing engine built on the Spark SQL engine.
Hive Project- Understand the various types of SCDs and implement these slowly changing dimesnsion in Hadoop Hive and Spark.
In this big data project, we will continue from a previous hive project "Data engineering on Yelp Datasets using Hadoop tools" and do the entire data processing using spark.
This Elasticsearch example deploys the AWS ELK stack to analyse streaming event data. Tools used include Nifi, PySpark, Elasticsearch, Logstash and Kibana for visualisation.