Solved end-to-end Data Science and Big Data projects

Get ready-to-use coding projects for solving real-world business problems


Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 50+ solved projects with iPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

Best Sellers in “Apache Hadoop Projects”

Big Data Project Hadoop Project-Analysis of Yelp Dataset using Hadoop Hive
The goal of this Hadoop project is to apply some data engineering principles to the Yelp dataset in the areas of processing, storage, and retrieval.
Big Data Project Yelp Data Processing Using Spark And Hive Part 1
In this big data project, we will continue from a previous Hive project, "Data engineering on Yelp Datasets using Hadoop tools", and do the entire data processing using Spark.
Big Data Project Data processing with Spark SQL
In this Apache Spark SQL project, we will go through provisioning data for retrieval using Spark SQL.
Big Data Project Implementing Slowly Changing Dimensions in a Data Warehouse using Hive and Spark
Hive Project - Understand the various types of SCDs and implement these slowly changing dimensions in Hadoop Hive and Spark.
Big Data Project Online Hadoop Projects -Solving small file problem in Hadoop
In this Hadoop project, we continue the series on data engineering by discussing and implementing various ways to solve the Hadoop small file problem.
Big Data Project Hive Project - Visualising Website Clickstream Data with Apache Hadoop
Analyze clickstream data of a website using Hadoop Hive to increase sales by optimizing every aspect of the customer experience on the website from the first mouse click to the last.
Big Data Project Real-Time Log Processing using Spark Streaming Architecture
In this Spark project, we are going to bring processing to the speed layer of the lambda architecture, which opens up capabilities to monitor application performance in real time, measure real-time comfort with applications, and send real-time alerts in case of security
Big Data Project Spark Project -Real-time data collection and Spark Streaming Aggregation
In this big data project, we will embark on real-time data collection and aggregation from a simulated real-time system using Spark Streaming.
Big Data Project Movielens dataset analysis using Hive for Movie Recommendations
In this Hadoop Hive project, you will work on Hive and HQL to analyze movie ratings using the MovieLens dataset for better movie recommendations.
Big Data Project Design a Hadoop Architecture
Learn to design Hadoop Architecture and understand how to store data using data acquisition tools in Hadoop.
Big Data Project Create a data pipeline based on messaging using Spark and Hive
In this Spark project, we will simulate a simple real-world batch data pipeline based on messaging using Spark and Hive.
Big Data Project Data Warehouse Design for E-commerce Environments
In this Hive project, you will design a data warehouse for e-commerce environments.
Big Data Project Hadoop Project for Beginners-SQL Analytics with Hive
In this Hadoop project, learn about the features in Hive that allow us to perform analytical queries over large datasets.
Big Data Project Finding Unique URL's using Hadoop Hive
Hive Project - Learn to write a Hive program to find the first unique URL, given 'n' URLs.
Big Data Project Airline Dataset Analysis using Hadoop, Hive, Pig and Impala
Hadoop Project - Perform basic big data analysis on an airline dataset using the big data tools Pig, Hive and Impala.
Big Data Project Data Mining Project on Yelp Dataset using Hadoop Hive
Use the Hadoop ecosystem to glean valuable insights from the Yelp dataset. You will analyze the different patterns that can be found in the Yelp dataset to come up with various approaches to solving a business problem.
Big Data Project Tough engineering choices with large datasets in Hive Part - 1
Explore efficient Hive usage in this Hadoop Hive project using various file formats such as JSON, CSV, ORC, and Avro, and compare their relative performance.
Big Data Project Yelp Data Processing using Spark and Hive Part 2
In this Spark project, we will continue building the data warehouse from the previous project, "Yelp Data Processing Using Spark And Hive Part 1", and will do further data processing to develop diverse data products.
Big Data Project Process a Million Song Dataset to Predict Song Preferences
In this big data project, we will discover songs for those artists that are associated with the different cultures across the globe.
Big Data Project Building a Data Warehouse using Spark on Hive
In this Hive project, we will build a Hive data warehouse from a raw dataset stored in HDFS and present the data in a relational structure so that querying the data will be natural.
Big Data Project Data Analysis and Visualisation using Spark and Zeppelin
In this big data project, we will talk about Apache Zeppelin. We will write code, write notes, build charts and share all in one single data analytics environment using Hive, Spark and Pig.
Big Data Project Using Apache Hive for Real-Time Queries and Analytics
Learn to write a Hadoop Hive program for real-time querying.
Big Data Project Design a Network Crawler by Mining Github Social Profiles
In this big data project, we will look at how to mine and make sense of connections in a simple way by building a Spark GraphX Algorithm and a Network Crawler.
Big Data Project Implementing OLAP on Hadoop using Apache Kylin
In this big data project, we will design an OLAP cube using the AdventureWorks database. The deliverable for this session will be to design a cube, build and implement it using Kylin, query the cube and even connect familiar tools (like Excel) with our new cube.
Big Data Project IoT Project-Learn to design an IoT Ready Infrastructure
The goal of this IoT project is to build an argument for a generalized streaming architecture for reactive data ingestion based on a microservice architecture.
Big Data Project NoSQL Project on Yelp Dataset using HBase and MongoDB
In this NoSQL project, we will use two NoSQL databases (HBase and MongoDB) to store Yelp business attributes and learn how to retrieve this data for processing or querying.
Big Data Project Microsoft Cortana Intelligence Suite Analytics Workshop
In this big data project, we'll work through a real-world scenario using the Cortana Intelligence Suite tools, including the Microsoft Azure Portal, PowerShell, and Visual Studio.
Big Data Project Integrating Spark and NoSQL Database for Data Analysis
In this project, we will look at two database platforms, MongoDB and Cassandra, and examine the philosophical differences in how these databases work and perform analytical queries.
Big Data Project Big Data Project on Processing Unstructured Data using Spark
In this project, we will evaluate and demonstrate how to handle unstructured data using Spark.
Big Data Project Hive Project- Denormalize JSON Data and analyse it with HIVE Scripts
In this Hive project, you will work on denormalizing JSON data and create Hive scripts with the ORC file format.
Big Data Project SQL vs NoSQL-Choosing the right DBMS for your Project
In this project, we will walk through the various classes of NoSQL databases and try to establish where each is the best fit.
Big Data Project Hadoop Project - Choosing the best SQL-on-Hadoop Engine
In this project, we will take a look at four different SQL-on-Hadoop engines: Hive, Phoenix, Impala and Presto.
Big Data Project Streaming ETL in Kafka with KSQL using NYC TLC Data
In this project, we will show how to build an ETL pipeline on streaming datasets using Kafka.

Hadoop Projects

Professionals and students who complete learning Hadoop from DeZyre often ask our industry experts:

“How and where can I get projects in Hadoop, Hive, Pig or HBase to get more exposure to the big data tools and technologies?”

DeZyre’s mini projects on Hadoop are designed to give big data beginners and experienced professionals a better understanding of the complex Hadoop architecture and its components, with practice big data sets across diverse business domains: Retail, Travel, Banking, Finance, Media and more.

Why should you enroll for DeZyre’s Big Data Hadoop projects?

Key Learnings from DeZyre’s Hadoop Projects

What are the best Hadoop projects for beginners?

For big data beginners who want to get started with the basics of the Hadoop ecosystem, DeZyre has interesting Hadoop project ideas for beginners that will help them learn Hadoop through 10 projects -

What will you get when you enroll for DeZyre’s Hadoop projects?