Solved end-to-end Data Science projects

Get ready-to-use coding projects for solving real-world business problems

Videos

Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 50+ solved projects with IPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

Apache Hadoop Projects

In this Hive project, you will design a data warehouse for e-commerce environments.

In this big data project, we'll work through a real-world scenario using the Cortana Intelligence Suite tools, including the Microsoft Azure Portal, PowerShell, and Visual Studio.

In this project, we will walk through the various classes of NoSQL databases and try to establish where each is the best fit.

Apache Hive Projects

Hadoop Project: Perform basic big data analysis on an airline dataset using the big data tools Pig, Hive, and Impala.

This is a continuation of the previous Hive project "Tough engineering choices with large datasets in Hive Part - 1", in which we will work on processing big data sets using Hive.

In this project, we will evaluate and demonstrate how to handle unstructured data using Spark.

Apache HBase Projects

The goal of this IoT project is to build an argument for a generalized streaming architecture for reactive data ingestion based on a microservice architecture.

In this Hadoop project, learn about the features in Hive that allow us to perform analytical queries over large datasets.

In this big data project, we will design an OLAP cube using the AdventureWorks database. The deliverable for this session is to design a cube, build and implement it using Kylin, query the cube, and even connect familiar tools (like Excel) to our new cube.

Apache Pig Projects

In this big data project, we will discover songs by artists who are associated with different cultures across the globe.

Hadoop Project: Perform basic big data analysis on an airline dataset using the big data tools Pig, Hive, and Impala.

Hadoop HDFS Projects

In this Hadoop project, we continue the series on data engineering by discussing and implementing various ways to solve the Hadoop small file problem.

Apache Oozie Projects

In this Hadoop project, you will use a sample application log file from an application server to demonstrate a scaled-down server log processing pipeline.

In this Spark Streaming project, we are going to build the backend of an IT job ad website by streaming data from Twitter for analysis in Spark.

In this PySpark project, you will simulate a complex real-world data pipeline based on messaging. This project is deployed using the following tech stack: NiFi, PySpark, Hive, HDFS, Kafka, Airflow, Tableau, and AWS QuickSight.

Apache Impala Projects

Hadoop Project: Perform basic big data analysis on an airline dataset using the big data tools Pig, Hive, and Impala.

In this Hive project, you will design a data warehouse for e-commerce environments.

This is a continuation of the previous Hive project "Tough engineering choices with large datasets in Hive Part - 1", in which we will work on processing big data sets using Hive.

Apache Flume Projects

Hadoop Projects for Beginners: Learn data ingestion from a source using Apache Flume and Kafka to make real-time decisions on incoming data.

In this Hadoop project, we continue the series on data engineering by discussing and implementing various ways to solve the Hadoop small file problem.

In this Spark project, we are going to bring processing to the speed layer of the lambda architecture, which opens up capabilities to monitor application real-time performance, measure real-time comfort with applications and real-time alert in case of security

Apache Sqoop Projects

In this Hadoop project, learn about the features in Hive that allow us to perform analytical queries over large datasets.

The goal of this Spark project is to analyze business reviews from the Yelp dataset and ingest the final output of data processing into Elasticsearch. Also, use the visualisation tool in the ELK stack to visualize various kinds of ad-hoc reports from the data.

Spark SQL Projects

In this Hive project, we will build a Hive data warehouse from a raw dataset stored in HDFS and present the data in a relational structure so that querying the data will be natural.

The goal of this Spark project for students is to explore the features of Spark SQL in practice on the latest version of Spark, i.e., Spark 2.0.

In this big data project, we will talk about Apache Zeppelin. We will write code, write notes, build charts, and share them, all in a single data analytics environment using Hive, Spark, and Pig.

Spark GraphX Projects

The goal of this Spark project is to analyse the level and strength of interactions between different areas of a telecom provider's coverage in the city of Milan.

In this Neo4j project, you will do network analysis using a graph database to find patterns in how a social network affects business reviews and ratings.

In this big data project, we will look at how to mine and make sense of connections in a simple way by building a Spark GraphX Algorithm and a Network Crawler.

Spark Streaming Projects

In this Spark Streaming project, we are going to build the backend of an IT job ad website by streaming data from Twitter for analysis in Spark.

The goal of this Hadoop project is to apply some data engineering principles to the Yelp dataset in the areas of processing, storage, and retrieval.

In this Spark project, we are going to bring processing to the speed layer of the lambda architecture, which opens up capabilities to monitor application real-time performance, measure real-time comfort with applications and real-time alert in case of security

Spark MLlib Projects

In this big data Spark project, we will do Twitter sentiment analysis using Spark Streaming on incoming streaming data.

In this Hackerday, we will go through the basics of statistics and see how Spark enables us to perform statistical operations like descriptive and inferential statistics over very large datasets.

Apache Spark Projects

In this Hive project, you will design a data warehouse for e-commerce environments.

In this PySpark project, you will simulate a complex real-world data pipeline based on messaging. This project is deployed using the following tech stack: NiFi, PySpark, Hive, HDFS, Kafka, Airflow, Tableau, and AWS QuickSight.

In this project, we will evaluate and demonstrate how to handle unstructured data using Spark.

PySpark Projects

PySpark Project: Get a handle on using Python with Spark through this hands-on data processing Spark Python tutorial.

In this project, we are going to talk about insurance forecasting using regression techniques.

Apache Zeppelin Projects

In this big data project, we'll work with Apache Airflow and write a scheduled workflow that downloads data from Wikipedia archives, uploads it to S3, processes it in Hive, and finally analyzes it in Zeppelin notebooks.

In this big data project, we will talk about Apache Zeppelin. We will write code, write notes, build charts, and share them, all in a single data analytics environment using Hive, Spark, and Pig.

Apache Kafka Projects

In this Hadoop project, you will use a sample application log file from an application server to demonstrate a scaled-down server log processing pipeline.

In this Spark project, we are going to bring processing to the speed layer of the lambda architecture, which opens up capabilities to monitor application real-time performance, measure real-time comfort with applications and real-time alert in case of security

In this project, we will show how to build an ETL pipeline on streaming datasets using Kafka.

Neo4j Projects

In this big data project using Neo4j, we will be remodelling the MovieLens dataset in a graph structure and using that structure to answer questions in different ways.

In this Neo4j project, you will do network analysis using a graph database to find patterns in how a social network affects business reviews and ratings.

Redis Projects

Spark Project: Discuss real-time monitoring of taxis in a city. The real-time data streaming will be simulated using Flume. The ingestion will be done using Spark Streaming.

Big Data Projects

Every year, people looking to begin their big data career run into a familiar conundrum:

"How can I land a big data job with limited experience in this field?"

For an emerging field like big data, finding internships or full-time big data jobs requires you to showcase relevant achievements working with popular open source big data tools like Hadoop, Spark, Kafka, Pig, Hive, and more. Big data and project-based learning are a perfect fit. The best way to get started is to begin working on diverse big data project titles under the mentorship of industry experts. Professionals will love working on these big data projects because the learning happens almost in secret: there is so much practical learning involved that you don't realize it. DeZyre's big data projects are perfect for beginners, college students, engineering students, professionals wanting to make a career switch, and anyone who wants to master big data skills with hands-on experience.

Big Data Projects for Beginners

If you have a graduate degree in analytics or a relevant field from a top-tier college, it is easy for you to get a big data job. Employers believe that you will be able to add value to their business because of the prestige of the college that awarded you the degree, and because the degree is in a subject relevant to the skills they are looking for. If you do not have an analytics degree from a top-tier college, then you need to build that trust yourself by showing that you have the big data skills the employer is looking for. The best way to build trust with the hiring manager is to work on interesting big data project ideas and build a portfolio of multiple big data projects: Hadoop projects, Spark projects, Hive projects, Kafka projects, Impala projects, and more. The more "real-world" the big data projects are, the more the hiring manager will trust that you will be an asset to their organization, and the greater your chances of landing the big data job. The best thing about big data careers is that the work you do on building diverse big data projects often closely resembles the work you will do once you are hired.

For IT professionals or anybody with basic big data knowledge, DeZyre's mini projects on big data will help them take responsibility for solving challenging data problems, and help them gain expertise in popular big data tools like Hadoop, Spark, Hive, Pig,

Big Data Projects for Engineering Students

The good news for people in search of big data projects for CSE students is that there are a couple of websites that have big data projects with source code. If you search for terms like "big data projects GitHub" or "big data projects Quora", you might find suggestions for multiple big data project titles. However, for students on the hunt for big data final year projects, titles and source code are not all they need for learning. Students need industry expert guidance for deeper understanding and greater retention of knowledge so that they can apply what they know to new real-world big data problems. DeZyre has an excellent project-based learning platform where students will enjoy using a spectrum of big data tools under expert guidance.

Here are some popular big data project titles among college students:


IT professionals and college students rate our big data projects as exceptional. Whether you are looking to upgrade your skills or to learn about the complete end-to-end implementation of various big data tools like Hadoop, Spark, Pig, Hive, Kafka, and more, DeZyre's mini projects on big data are just what you want.

What will you get when you enroll for DeZyre's big data projects?