
Real-Time Log Processing using Spark Streaming Architecture

In this Spark project, we bring processing to the speed layer of the lambda architecture, which opens up capabilities such as monitoring application performance in real time, measuring real-time comfort with applications, and raising real-time alerts in case of a security breach.
What are the prerequisites for this project?
  • Cloudera QuickStart VM

What will you learn

  • Making a case for real-time processing of log files
  • Getting logs in real time using Flume Log4J appenders
  • Making a case for Kafka for log aggregation
  • Storing log events as time series datasets in HBase
  • Integrating Hive and HBase for data retrieval using queries
  • Troubleshooting
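
To make the "time series datasets in HBase" item more concrete, here is a minimal sketch of one common row key design, written in plain Python with no HBase client involved. The log line layout, the `parse_log_line` and `build_row_key` helpers, and the `logger#reversed-timestamp` key format are all assumptions made for illustration; the project itself may use a different layout. Subtracting the timestamp from the largest 64-bit value makes the newest events sort first within each logger, because HBase orders rows lexicographically by key.

```python
from datetime import datetime, timedelta, timezone

# Largest signed 64-bit value, mirroring Java's Long.MAX_VALUE.
LONG_MAX = 2**63 - 1
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_log_line(line):
    """Parse a hypothetical 'timestamp LEVEL logger - message' log line."""
    timestamp_text, level, logger, _, message = line.split(" ", 4)
    timestamp = datetime.strptime(
        timestamp_text, "%Y-%m-%dT%H:%M:%S.%f"
    ).replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    epoch_millis = (timestamp - EPOCH) // timedelta(milliseconds=1)
    return {
        "epoch_millis": epoch_millis,
        "level": level,
        "logger": logger,
        "message": message,
    }


def build_row_key(event):
    """Build an assumed 'logger#reversed-timestamp' row key (zero padded)."""
    reversed_millis = LONG_MAX - event["epoch_millis"]
    return f"{event['logger']}#{reversed_millis:019d}"


older = parse_log_line(
    "2016-05-01T12:00:00.250 ERROR com.example.Login - Invalid password"
)
newer = parse_log_line(
    "2016-05-01T12:00:05.000 INFO com.example.Login - Login succeeded"
)
older_key = build_row_key(older)
newer_key = build_row_key(newer)
```

With this scheme, `newer_key` sorts before `older_key`, so a scan over one logger's key prefix returns the most recent events first.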

Project Description

A while back, we did web server access log processing using Spark and Hive. However, that was batch processing, which in the lambda architecture confines us to the batch and serving layers.

In this big data project, we go one step further by bringing processing to the speed layer of the lambda architecture, which opens up more capabilities. These include the ability to monitor application performance in real time, to measure real-time comfort with applications, and to raise real-time alerts in case of a security breach.

These capabilities will be explored using Spark Streaming in a streaming architecture.
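
As a rough illustration of the speed-layer idea, the sketch below groups log events into fixed-length micro-batches and flags any batch with too many errors. It is plain Python, not the Spark Streaming API, and the event tuples, the `micro_batches` and `alerts` helpers, the 10-second batch length, and the threshold of 3 errors are all assumptions chosen for the example.

```python
from collections import Counter


def micro_batches(events, batch_seconds):
    """Group (epoch_seconds, level) events by the start time of their batch."""
    batches = {}
    for epoch_seconds, level in events:
        batch_start = epoch_seconds - (epoch_seconds % batch_seconds)
        batches.setdefault(batch_start, []).append(level)
    # Return batches in time order.
    return dict(sorted(batches.items()))


def alerts(events, batch_seconds=10, error_threshold=3):
    """Return start times of batches whose ERROR count reaches the threshold."""
    flagged = []
    for batch_start, levels in micro_batches(events, batch_seconds).items():
        if Counter(levels)["ERROR"] >= error_threshold:
            flagged.append(batch_start)
    return flagged


# Hypothetical events: (seconds since start, log level).
sample_events = [
    (0, "INFO"), (2, "ERROR"), (5, "ERROR"), (9, "ERROR"),
    (12, "INFO"), (15, "ERROR"), (27, "ERROR"),
]
flagged_batches = alerts(sample_events)
```

Here only the first batch (seconds 0 to 9) holds three errors, so it is the only one flagged. A streaming engine applies the same per-batch logic continuously as events arrive instead of over a finished list.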

Note: The Cloudera QuickStart VM does not include Kafka. As stated in our objectives, we will make the case for using Kafka, but our implementation will not use it. Instead, we will integrate the log agent directly with Spark Streaming in this big data project.



Big Data & Enterprise Software Engineer

I am passionate about software development, databases, data analysis and the Android platform. My native language is Java, but no one has stopped me so far from learning and using Angular and Node.js. Data and data analysis are thrilling, and so are my experiences with SQL on Oracle, Microsoft SQL Server, Postgres and MyS...
