Each project comes with 2-5 hours of micro-videos explaining the solution.
Get access to 50+ solved projects with IPython notebooks and datasets.
Add project experience to your LinkedIn/GitHub profiles.
This is one of the best investments you can make with regard to career progression and growth in technological knowledge. I was pointed in this direction by a mentor in the IT world who I highly...
I think that they are fantastic. I attended Yale and Stanford and have worked at Honeywell, Oracle, and Arthur Andersen (Accenture) in the US. I have taken Big Data and Hadoop, NoSQL, Spark, Hadoop...
A while back, we processed web server access logs using Spark and Hive. That was batch processing, however, and in the lambda architecture batch processing only lets us operate in the batch and serving layers.
In this big data project, we go one step further by bringing processing to the speed layer of the lambda architecture, which opens up more capabilities. These include monitoring application performance in real time, measuring users' comfort with applications in real time, and raising real-time alerts in case of a security breach.
These capabilities will be explored using Spark Streaming in a streaming architecture.
Note: The Cloudera QuickStart VM does not include Kafka. In line with our objective, we will make the case for using Kafka, but our implementation will not use it. Instead, we will integrate the log agent with Spark Streaming in this big data project.
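To make the speed-layer idea more concrete, here is a minimal sketch in plain Python, not Spark Streaming itself, of the kind of computation such a layer might run: parse each incoming access-log line, keep a sliding window of recent micro-batches, and flag hosts with repeated denied requests. It assumes logs in the Common Log Format; the names `parse_line` and `SlidingWindowMonitor`, the window size, and the alert threshold are all hypothetical choices made for illustration, not part of the project.

```python
import re
from collections import Counter, deque

# Assumed input: Common Log Format
#   host ident user [timestamp] "method path protocol" status bytes
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return a dict of fields for one access-log line, or None if malformed."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    return record


class SlidingWindowMonitor:
    """Hypothetical helper: summarises the last `window_size` micro-batches."""

    def __init__(self, window_size=3, denied_threshold=3):
        # deque(maxlen=...) silently drops the oldest batch when full,
        # which gives sliding-window behaviour for free.
        self.batches = deque(maxlen=window_size)
        self.denied_threshold = denied_threshold

    def process_batch(self, lines):
        """Add one micro-batch and return (status_counts, alerting_hosts)."""
        records = [r for r in map(parse_line, lines) if r is not None]
        self.batches.append(records)
        window = [r for batch in self.batches for r in batch]

        # Monitoring view: how many responses of each status in the window.
        status_counts = Counter(r["status"] for r in window)

        # Alerting view: hosts with many 401/403 responses in the window.
        denied = Counter(
            r["host"] for r in window if r["status"] in (401, 403)
        )
        alerts = sorted(
            host for host, n in denied.items() if n >= self.denied_threshold
        )
        return status_counts, alerts
```

Calling `process_batch` once per batch of newly arrived lines returns the status-code counts for the current window together with the hosts that crossed the denied-request threshold. In a Spark Streaming implementation the same two aggregations would be expressed as windowed operations over a stream rather than over an in-memory deque.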
The goal of this Hadoop project is to apply some data engineering principles to the Yelp dataset in the areas of processing, storage, and retrieval.
In this big data project, we will see how data ingestion and loading are done with the Kafka Connect APIs, while transformation is done with the Kafka Streams API.
In this Apache Spark SQL project, we will go through provisioning data for retrieval using Spark SQL.