Web Server Log Processing using Hadoop

In this Hadoop project, you will use a sample log file from an application server to demonstrate a scaled-down server log processing pipeline.

What will you learn

  • The benefits of log-mining in certain industries
  • A full log-mining application use-case
  • Using Flume to ingest log data
  • Using Spark to process data
  • Integrating Kafka for complex event alerting
  • Using Impala for low-latency queries of processed log data
  • Coordinating the data processing pipeline with Oozie

What will you get

  • Access to a recording of the complete project
  • Access to all project material, such as data files and solution files

Project Description

Storing, processing and mining data from web server logs has become mainstream for a lot of companies today. Industry giants have used this engineering, and the accompanying science of machine learning, to extract information that has helped with ad targeting, improved search, application optimization and general improvement of the application user experience.
In this Hadoop project, we will use a sample log file from an application server to demonstrate a scaled-down server log processing pipeline. Going from ingestion to insight usually requires Hadoop-ecosystem tools like Flume, Pig, Spark, Hive/Impala, Kafka and Oozie, with HDFS for storage. We will look at these tools both holistically and specifically at each stage of the pipeline.
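
To make the processing stage of such a pipeline concrete, here is a minimal sketch in plain Python of what "mining" a log file can look like at small scale: parse each line, then count responses per status code and requests per client. It assumes the Apache Common Log Format, which may not match the log file used in the project. The names `LOG_PATTERN`, `parse_line` and `summarize`, as well as the sample lines, are hypothetical and made up for illustration. In the project itself this kind of work is done with Spark rather than a single Python process.

```python
import re
from collections import Counter

# Assumption: lines follow the Apache Common Log Format.
# The project's actual log layout may differ.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A size of "-" means no body was sent; treat it as zero bytes.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


def summarize(lines):
    """Count responses per status code and requests per client IP."""
    status_counts = Counter()
    requests_per_ip = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # skip malformed lines instead of failing the whole batch
        status_counts[record["status"]] += 1
        requests_per_ip[record["ip"]] += 1
    return status_counts, requests_per_ip


# Hypothetical sample lines, invented for this illustration.
SAMPLE_LINES = [
    '10.0.0.1 - - [10/Oct/2016:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.1 - - [10/Oct/2016:13:55:37 +0000] "GET /missing HTTP/1.1" 404 -',
    '10.0.0.2 - - [10/Oct/2016:13:55:38 +0000] "POST /login HTTP/1.1" 200 512',
    'not a log line',
]

status_counts, requests_per_ip = summarize(SAMPLE_LINES)
```

A per-IP request count like `requests_per_ip` is also the kind of signal an alerting stage could watch, for example to flag a client whose request rate suddenly spikes.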

Prerequisite:

  1. Students are expected to have a fair knowledge of Big Data and Hadoop.
  2. Installing the Cloudera QuickStart VM is essential to get the best from this class. Instructions on how to set up a Scala SDK and runtime can be found here.

 

Instructors

 
Michael

Big Data & Enterprise Software Engineer

I am passionate about software development, databases, data analysis and the Android platform. My native language is Java, but no one has stopped me so far from learning and using Angular and Node.js. Data and data analysis are thrilling, and so are my experiences with SQL on Oracle, Microsoft SQL Server, Postgres and MyS...

Curriculum For This Mini Project

 
  What are log files and types of log files (08m)
  Contents of a log file (09m)
  Uses of log files (19m)
  Process log file using Flume (10m)
  Ingest log data using Flume (07m)
  Using Spark to process data (07m)
  Downloads and Installations (02m)
  DoS Attacks and log files (07m)
  Using Apache Kafka for complex event processing (06m)
  Using Oozie to coordinate tasks (16m)
  Log file use-case (21m)
  Clone GitHub repository and summary overview (06m)
  Lambda Architecture for Data Infrastructure (05m)
  Solution Architecture overview (06m)
  Implement Flume Agent (27m)
  Troubleshooting Flume (29m)
  Spark Scale Execution (20m)
  Accumulator and execute Hive table (14m)
  Impala execution (15m)
  Coordination tasks using Oozie (16m)
  Hue Workflow (02m)
  Running Oozie on command line (05m)