Data Warehouse Design for E-commerce Environments

In this Hive project, you will design a data warehouse for e-commerce environments.
Videos
Each project comes with 2-5 hours of micro-videos explaining the solution.
Code & Dataset
Get access to 50+ solved projects with IPython notebooks and datasets.
Project Experience
Add project experience to your LinkedIn/GitHub profiles.

What you will learn

  • Roles in a data engineering project and their functions
  • Analysing a data problem
  • Designing a big data warehouse
  • Data processing using Spark
  • Data querying using Hive/Impala

Project Description

The goal of investing in data infrastructure is to sharpen the business's competitive edge and improve the company's bottom line.

In this big data project, we will design a data warehouse for a retail shop. The design and implementation focus on answering specific questions related to price optimization and inventory allocation. The two questions we will look to answer in this Hive project are:

  1. Were the higher-priced items selling in certain markets?
  2. Should inventory be reallocated or prices optimized based on geography?

The purpose of answering these questions with data is to boost the overall bottom line for the business while improving the experience for shoppers.
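To make the first question concrete, here is a minimal sketch in plain Python of the kind of aggregation it implies: for each market, what share of units sold came from higher-priced items. The sample rows, the column layout, the `HIGH_PRICE_THRESHOLD` cutoff, and the function name are all assumptions made for illustration; they do not come from the project dataset, and in the project itself an aggregation like this would more likely be expressed as a Hive or Spark query over the warehouse tables.

```python
from collections import defaultdict

# Hypothetical sample rows: (market, unit_price, units_sold).
# These values are invented for illustration, not taken from the project data.
SALES = [
    ("North", 120.0, 3),
    ("North", 20.0, 10),
    ("South", 150.0, 1),
    ("South", 15.0, 30),
    ("West", 200.0, 5),
    ("West", 25.0, 5),
]

# Assumed cutoff for what counts as a "higher-priced" item.
HIGH_PRICE_THRESHOLD = 100.0


def high_price_share_by_market(rows, threshold=HIGH_PRICE_THRESHOLD):
    """Return {market: fraction of units sold priced at or above threshold}."""
    total_units = defaultdict(int)
    high_units = defaultdict(int)
    for market, unit_price, units in rows:
        total_units[market] += units
        if unit_price >= threshold:
            high_units[market] += units
    # Skip markets with no units sold to avoid dividing by zero.
    return {
        market: high_units[market] / total
        for market, total in total_units.items()
        if total > 0
    }


if __name__ == "__main__":
    shares = high_price_share_by_market(SALES)
    # List markets from the highest to the lowest share of higher-priced units.
    for market, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{market}: {share:.0%} of units sold were higher-priced items")
```

A large gap between markets in this share would be one signal, under these assumptions, that inventory allocation or pricing by geography deserves a closer look.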

Curriculum For This Mini Project

 
  • Importance of Data Engineering (04m)
  • Overview of the E-commerce Business Problem (16m)
  • Solution Design (13m)
  • Data Exploration (19m)
  • Create Views (15m)
  • Migrate Data or ETL with Apache Sqoop (17m)
  • Executing and Troubleshooting a Sqoop Job (12m)
  • Create Views for EDA (Exploratory Data Analysis) (09m)
  • Perform EDA (Exploratory Data Analysis) (07m)
  • Analyse data with Spark (06m)
  • Perform EDA and Troubleshooting (19m)
  • Data Processing with Spark Scala (05m)
  • Scala function to create objects (17m)
  • Building a Map function (04m)
  • Key Value Pairs (03m)
  • Write to HDFS (05m)
  • Troubleshooting Spark script (14m)
  • Business example - Market segmentation (02m)
  • Oozie (05m)
  • Build and troubleshoot an Oozie script (19m)
  • Oozie dryrun (02m)
  • Oozie Coordinator (13m)
  • Troubleshooting Oozie configuration (05m)