Hadoop Project for Beginners-SQL Analytics with Hive

In this Hadoop project, learn about the Hive features that let us run analytical queries over large datasets.

What will you learn

  • Hive high-level recap
  • Data ingestion/transformation using Sqoop, Spark and Hive
  • File formats and query performance
  • Writing aggregate queries using UDAFs
  • Aggregation using windowing functions
  • Query optimizations in Hive

What will you get

  • Access to recording of the complete project
  • Access to all project-related material, such as data files and solution files

Project Description

In this Hive project, we take a deeper dive into some of the analytical features in Hive. SQL is still dominant and will remain so for the foreseeable future. Most big data tools have been adapted to let users interact with them through the familiar SQL language, because years of knowledge and skill have gone into training, acceptance, tooling, standards development and re-engineering around it. In many cases, these SQL features answer a lot of analytical questions without our ever needing to resort to machine learning, BI or data mining.

In this big data project, we look at the features in Hive that allow us to perform analytical queries over large datasets.
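As a rough illustration of the kind of analytical query meant here, the sketch below computes each product's share of its category total with a windowing function. It is not taken from the project material: the `sales` table and its rows are hypothetical, and SQLite (through Python's built-in `sqlite3` module, which needs SQLite 3.25 or later for window functions) stands in for Hive so that the example runs without a cluster. Hive accepts the same `SUM(...) OVER (PARTITION BY ...)` clause.

```python
import sqlite3

# Hypothetical sample data; not taken from the AdventureWorks dataset.
rows = [
    ("Bikes", "Road", 300.0),
    ("Bikes", "Mountain", 100.0),
    ("Clothing", "Jersey", 50.0),
    ("Clothing", "Socks", 50.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (category TEXT, product TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# SUM(...) OVER (PARTITION BY ...) keeps every input row and attaches the
# per-category total to it, unlike GROUP BY, which collapses the rows.
query = """
SELECT category,
       product,
       amount,
       100.0 * amount / SUM(amount) OVER (PARTITION BY category) AS pct_of_category
FROM sales
ORDER BY category, product
"""
result = conn.execute(query).fetchall()
conn.close()

for row in result:
    print(row)
```

Each output row still carries the product-level detail alongside the percentage, which is what separates a windowing function from a plain aggregate.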

We will be using the AdventureWorks dataset, which is stored in a MySQL database. The data therefore needs to be ingested and transformed before we proceed to analytics.

Curriculum For This Mini Project

  • Overview (11m)
  • SerDes (03m)
  • Cloning the dataset (02m)
  • Understanding the dataset (05m)
  • Load the data (06m)
  • Query the data (11m)
  • Create a Sqoop job (19m)
  • Executing the Sqoop job (15m)
  • Why is append used? (04m)
  • Build Hive tables on top of the data (11m)
  • Troubleshooting Hive table (03m)
  • Using Parquet and xpath (18m)
  • Select statement (12m)
  • Use case based aggregations (09m)
  • Q&A - the problem statement (14m)
  • Q&A - Hive versus MySQL database (08m)
  • Enhancing aggregate functions (25m)
  • Grouping sets (08m)
  • Rollup versus Cube (16m)
  • Windowing analytic functions (18m)
  • Properties of windowing analytic functions (24m)
  • Solving an example - finding % (16m)
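
The rollup versus cube distinction listed in the curriculum can be sketched in a few lines. This is plain Python that mimics the semantics of Hive's `GROUP BY ... WITH ROLLUP` and `GROUP BY ... WITH CUBE`, not code from the project itself: the helper names (`rollup_sets`, `cube_sets`, `aggregate`) and the sample rows are hypothetical. Rollup aggregates over every prefix of the grouping columns, while cube aggregates over every subset.

```python
from itertools import combinations

def rollup_sets(cols):
    """Grouping sets produced by ROLLUP: every prefix of the column list."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]

def cube_sets(cols):
    """Grouping sets produced by CUBE: every subset of the column list."""
    return [c for r in range(len(cols), -1, -1) for c in combinations(cols, r)]

def aggregate(rows, cols, grouping_sets, measure):
    """Sum `measure` once per grouping set.

    Columns left out of a grouping set show up as None in the key, the way
    SQL reports them as NULL. Assumes the data itself contains no None values.
    """
    totals = {}
    for grouping_set in grouping_sets:
        for row in rows:
            key = tuple(row[c] if c in grouping_set else None for c in cols)
            totals[key] = totals.get(key, 0) + row[measure]
    return totals

# Hypothetical sample data.
rows = [
    {"year": 2013, "region": "EU", "amount": 10},
    {"year": 2013, "region": "US", "amount": 20},
    {"year": 2014, "region": "EU", "amount": 30},
]
cols = ["year", "region"]

rollup = aggregate(rows, cols, rollup_sets(cols), "amount")
cube = aggregate(rows, cols, cube_sets(cols), "amount")

print(sorted(set(cube) - set(rollup), key=str))
```

With two grouping columns, rollup yields the detail rows, the per-year subtotals and the grand total; cube adds the per-region subtotals that rollup leaves out, which are the keys the final line prints.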