
Data engineering on Yelp Datasets using Hadoop tools

In this project, we will be applying some data engineering principles to the Yelp Dataset in the areas of processing, storage, and retrieval.

What will you learn

  • Making decisions on data storage and access
  • Processing and storing a variety of data formats
  • Avoiding Hadoop's small-files problem
  • Processing and storing binary content
  • Provisioning access to data using Hive/Impala
  • Serving layer vs. batch layer (Neo4j vs. HDFS)

What will you get

  • Access to a recording of the complete project
  • Access to all project-related material, such as data files and solution files

Prerequisites

Project Description

Data engineering is the practice of acquiring, aggregating, processing, and storing data, either in batch or in real time, and of providing a variety of means of serving that data to other users, who may include data scientists. It applies software engineering practices to big data.

In this hackerday project, we will apply some data engineering principles to the Yelp Dataset in the areas of processing, storage, and retrieval. We will not cover data ingestion, since the data is downloaded directly from the Yelp challenge website.

Instructors

 
Michael

Big Data & Enterprise Software Engineer

I am passionate about software development, databases, data analysis, and the Android platform. My native language is Java, but no one has stopped me so far from learning and using Angular and Node.js. Data and data analysis are thrilling, and so are my experiences with SQL on Oracle, Microsoft SQL Server, Postgres and MyS...