Work with Streaming Data using Twitter API to Build a Job Portal

In this Spark Streaming project, we are going to build the backend of an IT job ad website by streaming data from Twitter for analysis in Spark.

Videos

Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 50+ solved projects with IPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

What you will learn

  • Streaming Twitter data using Flume

  • Integrating Flume with Spark Streaming for processing Twitter events

  • Data processing with Spark

  • Integrating Kafka for complex event alerting

  • Integrating Spark with online databases

  • Coordinating the data processing pipeline with Oozie

Project Description

In this Spark project, we are going to be building a business. Yes, a business that is similar to an IT job ad site. This job portal will stream data from Twitter to locate recently published IT jobs, process them, and make them available via a simple search API. Also, to complete the circle, we will be building a notification feature for users who subscribe to job ad notifications.
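
The locate, process, and search steps described above can be sketched in plain Python. This is only a minimal illustration under stated assumptions: the keyword lists, the `is_it_job_ad` rule, and the `JobIndex` class are hypothetical names invented here, and an in-memory index stands in for the Flume, Spark, and database components the project actually uses.

```python
import re
from collections import defaultdict

# Hypothetical keyword lists; the project's real matching rules are not given.
HIRING_TERMS = {"hiring", "job", "vacancy", "opening"}
IT_TERMS = {"developer", "engineer", "devops", "python", "java", "spark", "hadoop"}


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric tokens."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def is_it_job_ad(text):
    """Treat a tweet as an IT job ad if it has a hiring term and an IT term."""
    tokens = set(tokenize(text))
    return bool(tokens & HIRING_TERMS) and bool(tokens & IT_TERMS)


class JobIndex:
    """In-memory inverted index; a stand-in for the project's search backend."""

    def __init__(self):
        self._postings = defaultdict(set)  # token -> set of tweet ids
        self._ads = {}                     # tweet id -> tweet text

    def add(self, tweet_id, text):
        self._ads[tweet_id] = text
        for token in tokenize(text):
            self._postings[token].add(tweet_id)

    def search(self, query):
        """Return ads containing every query term, ordered by tweet id."""
        terms = tokenize(query)
        if not terms:
            return []
        ids = set.intersection(*(self._postings.get(t, set()) for t in terms))
        return [self._ads[i] for i in sorted(ids)]


# Made-up sample tweets for illustration only.
sample_tweets = [
    (1, "We are hiring a Spark developer in Berlin"),
    (2, "Great weather today, loving it"),
    (3, "Job opening: Java engineer, remote"),
]

index = JobIndex()
for tweet_id, text in sample_tweets:
    if is_it_job_ad(text):
        index.add(tweet_id, text)
```

Calling `index.search("java engineer")` on this sample returns only the third tweet, since the second tweet is filtered out before indexing and the first does not mention Java.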

On completion of this big data project, the job portal will list every IT job tweeted and give users an apply-early advantage.
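
The apply-early advantage depends on notifying subscribers as soon as a matching ad arrives. Below is a minimal sketch of that matching step, assuming subscribers register keywords of interest. The `Notifier` class and its `outbox` list are hypothetical; in the project itself Kafka carries the alerts, and a plain list stands in for it here.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric tokens."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


class Notifier:
    """Matches incoming job ads against subscriber keywords (illustrative only)."""

    def __init__(self):
        self._subscriptions = {}  # user -> set of lowercase keywords
        self.outbox = []          # stand-in for an alert topic such as Kafka

    def subscribe(self, user, keywords):
        wanted = self._subscriptions.setdefault(user, set())
        wanted.update(k.lower() for k in keywords)

    def publish(self, ad_text):
        """Queue an alert for each user with a keyword in the ad; return them."""
        tokens = set(tokenize(ad_text))
        notified = sorted(
            user for user, keywords in self._subscriptions.items()
            if keywords & tokens
        )
        for user in notified:
            self.outbox.append((user, ad_text))
        return notified
```

A subscriber who registered the keyword `Spark` would be returned by `publish("Hiring a Spark developer")`, while an ad with no matching keyword leaves the outbox unchanged.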

Similar Projects

Big Data Project Web Server Log Processing using Hadoop
In this Hadoop project, you will use a sample application log file from an application server to demonstrate a scaled-down server log processing pipeline.
Big Data Project Real-Time Log Processing using Spark Streaming Architecture
In this Spark project, we are going to bring processing to the speed layer of the lambda architecture, which opens up capabilities to monitor application performance in real time, measure real-time comfort with applications, and raise real-time alerts in case of security
Big Data Project Making real time decision on incoming data using Flume and Kafka
Hadoop Projects for Beginners - Learn data ingestion from a source using Apache Flume and Kafka to make real-time decisions on incoming data.
Big Data Project Hadoop Project-Analysis of Yelp Dataset using Hadoop Hive
The goal of this Hadoop project is to apply some data engineering principles to the Yelp dataset in the areas of processing, storage, and retrieval.

Curriculum For This Mini Project

  9-Dec-2016   02h 33m
  10-Dec-2016  02h 34m