Each project comes with 2-5 hours of micro-videos explaining the solution.
Get access to 50+ solved projects with IPython notebooks and datasets.
Add project experience to your LinkedIn/GitHub profiles.
Initially, I was unaware of how this would cater to my career needs. But then I came across the reviews on the website. I went through many of them and found them all positive. I would...
The project orientation is unique, and it helps you understand the real-time scenarios most industries are dealing with. And there is no limit; one can go through as many projects...
Spark is the go-to framework for today's big data processing, and many companies are jumping on the Spark bandwagon.
However, Spark is notorious for being easy to get started with but very difficult to master. Mastering Spark takes more than knowledge of its APIs; it also requires knowledge of its internals. Because of this, many developers who take on a production data use case run into unfamiliar problems that were never discussed during training.
This Hackerday picks apart a few of these tasks and scenarios, which are rarely discussed in training but can burden developers in practice.
We will look at Spark memory management, cluster resource allocation, clustering, repartitioning, and more.
The goal of the Hackerday is to make Spark developers better at their craft and to help those just learning Spark quickly appreciate the depth of the framework. The idea is to go beyond simple use cases into complex scenarios and data pipelines, so that students can understand the issues that come with real-world workloads.
All the learning for the sessions will be done on Spark 2.
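As a taste of the memory management topic, here is a minimal sketch of how an executor's heap is divided under Spark 2's unified memory model. It assumes the Spark 2.x defaults of 300 MB reserved memory, `spark.memory.fraction = 0.6` and `spark.memory.storageFraction = 0.5`. The function name `executor_memory_regions` and the 4096 MB heap are hypothetical, chosen only for illustration, and the sketch ignores off-heap memory and container overhead.

```python
# Rough arithmetic for Spark 2.x unified memory on a single executor.
# Assumption: default settings; names here are illustrative, not a Spark API.

RESERVED_MB = 300  # memory Spark sets aside before applying any fractions


def executor_memory_regions(heap_mb, memory_fraction=0.6, storage_fraction=0.5):
    """Estimate the on-heap memory regions for one executor.

    heap_mb          -- executor heap size (spark.executor.memory), in MB
    memory_fraction  -- spark.memory.fraction (0.6 by default in Spark 2.x)
    storage_fraction -- spark.memory.storageFraction (0.5 by default)
    """
    usable = heap_mb - RESERVED_MB
    if usable <= 0:
        raise ValueError("heap must be larger than the reserved memory")

    unified = usable * memory_fraction      # shared by execution and storage
    storage = unified * storage_fraction    # cached blocks protected from eviction
    execution = unified - storage           # shuffles, joins, sorts, aggregations
    user = usable - unified                 # user data structures and metadata

    return {
        "unified_mb": unified,
        "storage_mb": storage,
        "execution_mb": execution,
        "user_mb": user,
    }


if __name__ == "__main__":
    # Hypothetical executor started with a 4 GB heap.
    for region, size in executor_memory_regions(4096).items():
        print(f"{region}: {size:.1f}")
```

The storage and execution figures are starting points, not hard limits: in the unified model, either side can borrow from the other when the other side's share is unused.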
In this project, we will look at Cassandra: what it is suited for, especially in a Hadoop environment, how to integrate it with Spark, and how to install it in our lab environment.
This Elasticsearch example deploys the AWS ELK stack to analyse streaming event data. Tools used include NiFi, PySpark, Elasticsearch, Logstash and Kibana for visualisation.
In this project, we will look at running various use cases in the analysis of crime data sets using Apache Spark.