In this project, we want to take a deeper dive into some analytical features in Hive. Using SQL is still very dominant and will remain so for the nearest features. Most big data tools have been adapted to allow users interact with them using the familiar SQL language. This is because of years of knowledge and skill that has gone into training, acceptance, tooling, standards development and re-engineering. So in many cases, using these cool features of SQL to access data solves a lot of analytical questions without ever needing us to resort to machine learning, BI or data mining.
In this project, we want to look at these features in Hive that allows us to perform analytical queries over large datasets.
We will be using the adventure works dataset in a MySQL dataset. Therefore, there will be a need to ingest and transform the data before we proceed to analytics.