Web Server Log Processing using Hadoop

In this Hadoop project, you will use a sample log file from an application server to demonstrate a scaled-down server log processing pipeline.

What will you learn

  • The benefits of log-mining in certain industries
  • A full log-mining application use-case
  • Using Flume to ingest log data
  • Using Spark to process data
  • Integrating Kafka for complex event alerting
  • Using Impala for low-latency queries on processed log data
  • Coordinating the data processing pipeline with Oozie

What will you get

  • Access to recording of the complete project
  • Access to all project material, such as data files and solution files

Project Description

Storing, processing and mining data from web server logs has become mainstream for many companies. Industry giants have used this engineering, and the accompanying science of machine learning, to extract information that has helped with ad targeting, improved search, application optimization and general improvement of the application user experience.
In this Hadoop project, we will use a sample log file from an application server to demonstrate a scaled-down server log processing pipeline. Getting from ingestion to insight usually requires Hadoop-ecosystem tools such as Flume, Pig, Spark, Hive/Impala, Kafka and Oozie, with HDFS for storage. We will look at these tools both holistically and specifically at each stage of the pipeline.
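To make the processing stage of such a pipeline concrete, here is a minimal sketch in plain Python rather than Spark. It assumes the log lines follow the Apache Common Log Format, which may not match the sample file used in the project. The names `parse_line`, `summarize` and `SAMPLE_LINES` are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Assumption: Apache Common Log Format. The project's real log layout may differ.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A size of "-" means no response body was sent.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


def summarize(lines):
    """Count requests per status code and per client host, skipping bad lines."""
    by_status = Counter()
    by_host = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        by_status[record["status"]] += 1
        by_host[record["host"]] += 1
    return by_status, by_host


# Hypothetical sample data, invented for this sketch.
SAMPLE_LINES = [
    '10.0.0.1 - - [10/Oct/2016:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2016:13:55:37 +0000] "GET /missing HTTP/1.1" 404 -',
    '10.0.0.1 - - [10/Oct/2016:13:55:38 +0000] "POST /login HTTP/1.1" 200 512',
    'not a log line',
]

if __name__ == "__main__":
    status_counts, host_counts = summarize(SAMPLE_LINES)
    print(dict(status_counts))
    print(host_counts.most_common(1))
```

In a full pipeline, the same parse-then-aggregate step would typically run as a distributed job over files in HDFS, and a per-host count like this one is the kind of signal that could feed an alert on unusual traffic.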

Prerequisite:

  1. Students are expected to have a fair knowledge of Big Data and Hadoop.
  2. Installing the Cloudera QuickStart VM is essential to get the best from this class. Instructions on how to set up a Scala SDK and runtime can be found here.

 

Instructors

 
Michael

Big Data & Enterprise Software Engineer

I am passionate about software development, databases, data analysis and the Android platform. My native language is Java, but no one has stopped me so far from learning and using Angular and Node.js. Data and data analysis are thrilling, and so are my experiences with SQL on Oracle, Microsoft SQL Server, Postgres and MyS...

Curriculum For This Mini Project

 
  What are log files and types of log files (00:08:43)
  Contents of a log file (00:09:07)
  Uses of log files (00:19:27)
  Process log file using Flume (00:10:04)
  Ingest log data using Flume (00:07:10)
  Using Spark to process data (00:07:08)
  Downloads and Installations (00:02:23)
  DoS Attacks and log files (00:07:19)
  Using Apache Kafka for complex event processing (00:06:26)
  Using Oozie to coordinate tasks (00:16:28)
  Log file use-case (00:21:58)
  Clone GitHub repository and summary overview (00:06:02)
  Lambda Architecture for Data Infrastructure (00:05:26)
  Solution Architecture overview (00:06:16)
  Implement Flume Agent (00:27:20)
  Troubleshooting Flume (00:29:47)
  Spark Scale Execution (00:20:15)
  Accumulator and execute Hive table (00:14:03)
  Impala execution (00:15:32)
  Coordination tasks using Oozie (00:16:20)
  Hue Workflow (00:02:37)
  Running Oozie on command line (00:05:09)