Implementing Slowly Changing Dimensions in a Data Warehouse using Hive and Spark

Hive Project: Understand the various types of SCDs and implement these slowly changing dimensions in Hadoop Hive and Spark.
Videos
Each project comes with 2-5 hours of micro-videos explaining the solution.
Code & Dataset
Get access to 50+ solved projects with IPython notebooks and datasets.
Project Experience
Add project experience to your LinkedIn/GitHub profiles.

What you will learn

  • What a slowly changing dimension (SCD) is
  • Types of slowly changing dimensions
  • Updates and transactions in Hive
  • Implementing SCD 2 & 3 in Hive
  • Implementing SCD 2 & 3 in Spark

Project Description

One of the broadest uses of Hadoop today is building a data warehousing platform on top of a data lake. In building a data warehouse, the traditions left to us by Kimball and Inmon are still very much in play.

While not every one of the legacy rules should be implemented as-is on a big data platform, the issue of slowly changing dimensions is still on the front burner.

A slowly changing dimension (SCD) is a warehouse dimension whose attributes rarely change. However, when they do change, there should be a systematic approach to capturing that change. Examples of SCDs are customer and product information.
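As an illustration of what "systematically capturing a change" can look like, here is a minimal sketch of the SCD Type 2 idea (close the current row, then add a new current row) in plain Python. It is not the project's Hive or Spark code; the `CustomerRow` fields, the `apply_scd2` helper, and the sample cities and dates are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class CustomerRow:
    customer_id: int          # business (natural) key, hypothetical
    city: str                 # the tracked attribute, hypothetical
    valid_from: date
    valid_to: Optional[date]  # None while the row is the current version
    is_current: bool


def apply_scd2(rows: List[CustomerRow], customer_id: int,
               new_city: str, change_date: date) -> List[CustomerRow]:
    """Return a new row list with one SCD Type 2 change applied."""
    result = []
    found = False    # did we see a current row for this customer?
    changed = False  # did the tracked attribute actually change?
    for row in rows:
        if row.customer_id == customer_id and row.is_current:
            found = True
            if row.city == new_city:
                result.append(row)  # nothing changed, keep the row as-is
            else:
                # Expire the old version instead of overwriting it.
                result.append(replace(row, valid_to=change_date,
                                      is_current=False))
                changed = True
        else:
            result.append(row)
    if changed or not found:
        # Add the new current version (or the first row for a new customer).
        result.append(CustomerRow(customer_id, new_city,
                                  change_date, None, True))
    return result


history = [CustomerRow(1, "Austin", date(2020, 1, 1), None, True)]
history = apply_scd2(history, 1, "Denver", date(2023, 6, 1))
for row in history:
    print(row)
```

After the call, the list holds two rows for customer 1: the expired Austin row and the current Denver row, so the full history of the attribute is preserved.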

In this Hive project, we will look at the various types of SCDs and learn to implement SCDs in Hive and Spark.
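To make the difference between SCD types concrete, the following rough sketch contrasts Type 1 (overwrite the value, keeping no history) with Type 3 (keep the prior value in a dedicated column). It uses plain Python dictionaries rather than Hive tables or Spark DataFrames, and the column names `city` and `previous_city` and both helper functions are assumptions made for illustration only.

```python
from typing import Dict, Optional

Row = Dict[str, Optional[str]]


def apply_scd1(row: Row, new_city: str) -> Row:
    """Type 1: overwrite the attribute; the old value is lost."""
    updated = dict(row)  # copy so the input row is not mutated
    updated["city"] = new_city
    return updated


def apply_scd3(row: Row, new_city: str) -> Row:
    """Type 3: move the old value into a 'previous' column, then overwrite."""
    updated = dict(row)
    if row["city"] != new_city:
        updated["previous_city"] = row["city"]
        updated["city"] = new_city
    return updated


customer: Row = {"customer_id": "1", "city": "Austin", "previous_city": None}
type1 = apply_scd1(customer, "Denver")
type3 = apply_scd3(customer, "Denver")
print(type1)
print(type3)
```

Type 3 only remembers one prior value per tracked column, so a second change pushes the oldest value out; when complete history matters, Type 2 is the usual choice.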

Curriculum For This Mini Project

 
  Project Overview
05m
  What is Data Warehousing?
03m
  Difference between Parquet and ORC
09m
  What is a slowly changing dimension?
07m
  Working with AdventureWorks Dataset to Understand SCD
04m
  Copy data using Sqoop to Hive
02m
  Denormalize Data
12m
  Example to understand SCD
06m
  Running the Sqoop Job
10m
  Hive Querying to View the Data using Hue
09m
  Understanding the Changing Dimensions in Customer Demographics
06m
  Understanding Different Types of SCDs
18m
  Discussion on ELT vs ETL
05m
  Data Warehouse vs Data Lake
21m
  Data Lakes from a Data Architecture Perspective
06m
  Create Customer Table with SCD-Type 2
08m
  Create Customer Demo Table SCD-Type 4 and CreditCard Table with SCD Type 1
03m
  Transformations for SCD Type 1 on Credit Card Table
07m
  Hive Configurations to set SCD
00m
  Transformations for SCD Type 1 Continued
26m
  Transformations for SCD Type 4 with example
54m