Real-time Auto Tracking with Spark-Redis

In this Spark project, we discuss real-time monitoring of taxis in a city. The real-time data stream is simulated using Flume, and ingestion is done using Spark Streaming.

What will you learn

Understanding the project, its requirements, and a walkthrough of the application
Setting up the Cloudera VM and importing the application into Eclipse
Running a Redis server on the VM
Key-value (NoSQL) databases
Integrating the database with the application and writing queries
Understanding Redis, the datatypes it supports, and its uses
Using Redis as pub/sub message-oriented middleware
Using Redis as a caching server and persistence store
Learning Redis commands for tasks such as publish and subscribe
Downloading and loading the T-Drive Trajectory Dataset
Extending the application to production grade using MongoDB
Integrating Flume using an Avro sink
Streaming data with Flume/Spark integration
Real-time processing and display of streamed data on a dashboard
Comparing the capability of Redis as pub/sub middleware with that of the widely adopted Apache Kafka
Debugging the application

Project Description

The era of IoT brought with it the need to stream data, process it, and sometimes display the results in real or near-real time. In this Spark Streaming project, we will use a dataset that stands in for real-time sensor feeds to track vehicles around the city of Beijing. We will track each vehicle as its signal is received from our streaming simulation (using Flume). We will receive the data streams using Spark Streaming and use Redis as pub/sub middleware.
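
To make the shape of the incoming feed concrete, here is a minimal Java sketch that parses one sensor record. It assumes the T-Drive style line layout of taxi id, timestamp, longitude, latitude separated by commas; the class name TaxiFix and its fields are hypothetical and are not taken from the project code.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** One GPS fix from the feed (assumed layout: taxi id, timestamp, longitude, latitude). */
final class TaxiFix {
    // Assumed timestamp pattern, e.g. "2008-02-02 15:36:08".
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

    final int taxiId;
    final LocalDateTime time;
    final double longitude;
    final double latitude;

    TaxiFix(int taxiId, LocalDateTime time, double longitude, double latitude) {
        this.taxiId = taxiId;
        this.time = time;
        this.longitude = longitude;
        this.latitude = latitude;
    }

    /** Parses a single comma-separated line; rejects lines without exactly four fields. */
    static TaxiFix parse(String line) {
        String[] parts = line.trim().split(",");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 comma-separated fields: " + line);
        }
        return new TaxiFix(
                Integer.parseInt(parts[0].trim()),
                LocalDateTime.parse(parts[1].trim(), FORMAT),
                Double.parseDouble(parts[2].trim()),
                Double.parseDouble(parts[3].trim()));
    }

    public static void main(String[] args) {
        TaxiFix fix = parse("1,2008-02-02 15:36:08,116.51172,39.92123");
        System.out.println("taxi " + fix.taxiId + " at " + fix.time
                + " lat=" + fix.latitude + " lon=" + fix.longitude);
    }
}
```

A parsed record like this is the kind of value a streaming job could serialize and publish on a Redis channel for the dashboard to consume.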

Furthermore, we will use a Java Swing-based application to display real-time information about all vehicles being tracked. While tracking each vehicle, we will look at metrics such as current speed, total time, and distance covered.
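
As an illustration of how such metrics could be derived from successive GPS fixes, the sketch below accumulates distance with the haversine formula and computes speed from the most recent pair of fixes. This is an assumption about one reasonable approach, not the project's actual implementation; the class TripMetrics, its method names, and the 6371 km mean Earth radius are choices made for this example.

```java
/** Running per-vehicle metrics computed from successive (time, lat, lon) fixes. */
final class TripMetrics {
    private static final double EARTH_RADIUS_KM = 6371.0; // mean Earth radius (assumed)

    private boolean hasPrevious;
    private long prevEpochSeconds;
    private double prevLat;
    private double prevLon;

    private double totalKm;
    private long totalSeconds;
    private double currentSpeedKmh;

    /** Great-circle distance in kilometres between two points given in degrees. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** Folds in a new fix; fixes that do not advance the clock are ignored. */
    void update(long epochSeconds, double lat, double lon) {
        if (hasPrevious) {
            long seconds = epochSeconds - prevEpochSeconds;
            if (seconds <= 0) {
                return; // duplicate or out-of-order fix
            }
            double km = haversineKm(prevLat, prevLon, lat, lon);
            totalKm += km;
            totalSeconds += seconds;
            currentSpeedKmh = km / (seconds / 3600.0);
        }
        hasPrevious = true;
        prevEpochSeconds = epochSeconds;
        prevLat = lat;
        prevLon = lon;
    }

    double totalKm() { return totalKm; }
    long totalSeconds() { return totalSeconds; }
    double currentSpeedKmh() { return currentSpeedKmh; }

    public static void main(String[] args) {
        TripMetrics metrics = new TripMetrics();
        metrics.update(0L, 39.92123, 116.51172);
        metrics.update(600L, 39.93883, 116.51135);
        System.out.printf("distance=%.3f km, time=%d s, speed=%.1f km/h%n",
                metrics.totalKm(), metrics.totalSeconds(), metrics.currentSpeedKmh());
    }
}
```

Keeping one such object per taxi id (for example in a map) would let a dashboard refresh each vehicle's row as new fixes arrive.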

While this Spark project is about tracking vehicles, the principles shared in this big data project apply broadly to real-time sensor data processing and other IoT use cases.

Curriculum For This Mini Project

Discussion on Project Requirements (04m)
Walkthrough of the Application (01m)
Start Cloudera VM and Put Application on Eclipse (01m)
Agenda for the Session (03m)
Install and Start Redis Server on Quickstart VM (02m)
Introduction to NoSQL Systems (02m)
Integration of Application at Database Level (08m)
What is Redis? (04m)
Datatypes in Redis (01m)
Use Cases for Redis (06m)
Exploring and Working with Redis Commands (14m)
How to Use Redis as a Persistent Store? (01m)
What are Messaging Servers? (02m)
Redis - Publish-Subscribe (08m)
Commands to use Redis for Publish Subscribe (02m)
Redis Drivers - Jedis (01m)
Open Quickstart VM and Fire Up Eclipse (02m)
Introduction to T-Drive Trajectory Dataset (02m)
Overview of the Application - Project in a Nutshell (10m)
Use of MongoDB to extend the application to production grade (05m)
Recap of the Previous Session (05m)
Agenda for the Session (02m)
Real-Time Analytics Use Cases on Streaming Data (08m)
Spark Streaming - Real-Time and Near Real-Time Streaming (05m)
Flume for Real-Time Streaming (05m)
Spark Streaming Concepts (04m)
Flume Integration (01m)
Using Avro Sink for Flume Integration (05m)
Redis Performance Benchmarks (02m)
Processing logic to Initiate Streaming and Streaming Content (04m)
Spark Streaming Execution (04m)
Flume Agent Connection to Listen to Data (Run the Flume Agents) (01m)
Dashboard Visualization of Taxis in Real-Time (08m)
Sequence of Commands for Execution (01m)
Debugging (02m)
Making Redis Faster with Buffer as ConcurrentLinkedQueue (05m)
Redis vs Kafka (06m)
Processing Completed (01m)
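
The curriculum mentions using a ConcurrentLinkedQueue as a buffer. The outline does not say where that buffer sits, so the following Java sketch assumes one plausible arrangement: a fast receiving thread (for example, a pub/sub callback) only enqueues messages, while a separate consumer drains them in batches. The class MessageBuffer and its methods are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Lock-free hand-off buffer between a message receiver and a slower consumer. */
final class MessageBuffer {
    private final ConcurrentLinkedQueue<String> queue = new ConcurrentLinkedQueue<>();

    /** Called by the receiving thread; returns immediately without blocking. */
    void offer(String message) {
        queue.offer(message);
    }

    /** Removes and returns up to maxBatch messages in arrival order. */
    List<String> drain(int maxBatch) {
        List<String> batch = new ArrayList<>();
        String message;
        while (batch.size() < maxBatch && (message = queue.poll()) != null) {
            batch.add(message);
        }
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        MessageBuffer buffer = new MessageBuffer();

        // Simulated receiver thread standing in for a subscriber callback.
        Thread receiver = new Thread(() -> {
            for (int i = 0; i < 5; i++) {
                buffer.offer("fix-" + i);
            }
        });
        receiver.start();
        receiver.join();

        // The consumer (for example, a dashboard refresh loop) drains in batches.
        List<String> batch;
        while (!(batch = buffer.drain(2)).isEmpty()) {
            System.out.println("processing batch: " + batch);
        }
    }
}
```

The design choice here is that the receiving side never waits on slow work such as rendering or database writes; note that the queue is unbounded, so a real deployment would also need a policy for a consumer that falls behind.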