Each project comes with 2-5 hours of micro-videos explaining the solution.
Get access to 50+ solved projects with iPython notebooks and datasets.
Add project experience to your Linkedin/Github profiles.
I have worked for more than 15 years in Java and J2EE and have recently developed an interest in Big Data technologies and Machine learning due to a big need at my workspace. I was referred here by a... Read More
Initially, I was unaware of how this would cater to my career needs. But when I stumbled through the reviews given on the website. I went through many of them and found them all positive. I would... Read More
We have come to learn that Hadoop's distributed file system was engineered to favor fewer larger files over many small files. However, we mostly would not have control over how data come. Many data ingestion to data infrastructures come in small bits and whether we are implementing a data lake on HDFS or not, we will have to deal with this data inputs.
In this online hadoop project, we are going to be continuing the series on data engineering by discussing and implementing various ways to resolve the small file problem in hadoop.
We will start by defining what it means, how inevitable this situation could arise, how to identify bottlenecks in a hadoop cluster owing to the small file problem and varieties of ways to solve them.
In this spark streaming project, we are going to build the backend of a IT job ad website by streaming data from twitter for analysis in spark.
In this hive project, you will design a data warehouse for e-commerce environments.
In this project, we will walk through all the various classes of NoSQL database and try to establish where they are the best fit.