Each project comes with 2-5 hours of micro-videos explaining the solution.
Get access to 50+ solved projects with iPython notebooks and datasets.
Add project experience to your Linkedin/Github profiles.
One of the broadest use of Hadoop today is building data warehousing platform off a data lake. And in building a data warehouse, the traditions left us by Kimball and Inmon is still very much in play.
Why not every one of the legacy rules should be implemented as as-is in the big data platform, the issue of slow-changing dimensions is still a front-burner.
The slow changing dimension of warehouse dimension that is said to rarely change. However, when they change, there should be a systematic approach to capturing that change. Examples of SCDs are customer and products information.
In this hive project, we will look at the various types of SCDs and learn to implements SCDs in Hive and Spark.
In this project, we will take a look at three different SQL-on-Hadoop engines - Hive, Phoenix, Impala and Presto.
In this spark project, we will continue building the data warehouse from the previous project Yelp Data Processing Using Spark And Hive Part 1 and will do further data processing to develop diverse data products.
Hive Project -Learn to write a Hive program to find the first unique URL, given 'n' number of URL's.