One of the broadest use of Hadoop today is building data warehousing platform off a data lake. And in building a data warehouse, the traditions left us by Kimball and Inmon is still very much in play.
Why not every one of the legacy rules should be implemented as as-is in the big data platform, the issue of slow-changing dimensions is still a front-burner.
The slow changing dimension of warehouse dimension that is said to rarely change. However, when they change, there should be a systematic approach to capturing that change. Examples of SCDs are customer and products information.
In this hackerday, we will look at the various types of SCDs and how to implements SCDs in Hive and Spark.