Design a Hadoop Architecture

Design a Hadoop Architecture

Learn to design Hadoop Architecture and understand how to store data using data acquisition tools in Hadoop.


Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 50+ solved projects with iPython notebooks and datasets.

Project Experience

Add project experience to your Linkedin/Github profiles.

Customer Love

Read All Reviews

Mike Vogt

Information Architect at Bank of America

I have had a very positive experience. The platform is very rich in resources, and the expert was thoroughly knowledgeable on the subject matter - real world hands-on experience. I wish I had this... Read More

Swati Patra

Systems Advisor , IBM

I have 11 years of experience and work with IBM. My domain is Travel, Hospitality and Banking - both sectors process lots of data. The way the projects were set up and the mentors' explanation was... Read More

What will you learn

Learn to design Hadoop Architecture
Learn to use data acquisition tools in Hadoop to store data
Learn to write real-time queries in Apache Hive

Project Description

Assume that the web server creates a log file with timestamp and query. How will you design the Hadoop architecture (explaining how you will store the data) that can help you return top 15 queries made in the last 12 hours?

Similar Projects

This is in continuation of the previous Hive project "Tough engineering choices with large datasets in Hive Part - 1", where we will work on processing big data sets using Hive.

In this spark project, we will continue building the data warehouse from the previous project Yelp Data Processing Using Spark And Hive Part 1 and will do further data processing to develop diverse data products.

In this hadoop project, we are going to be continuing the series on data engineering by discussing and implementing various ways to solve the hadoop small file problem.

Curriculum For This Mini Project

Design a Hadoop Architecture