Building Real-Time Data Pipelines with Kafka Connect

In this big data project, we will see how data ingestion and loading are done with the Kafka Connect API, while transformation is done with the Kafka Streams API.

Videos

Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 50+ solved projects with iPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

Customer Love

Mohamed Yusef Ahmed

Software Developer at Taske

Recently I became interested in Hadoop as I think it's a great platform for storing and analyzing large structured and unstructured data sets. The experts did a great job not only explaining the...

James Peebles

Data Analytics Leader, IQVIA

This is one of the best investments you can make with regard to career progression and growth in technological knowledge. I was pointed in this direction by a mentor in the IT world who I highly...

What will you learn

Kafka and data warehousing
Real-time data warehousing
Kafka Connect API
Kafka Streams API
End-to-end Kafka pipeline

Project Description

Lately, the phrase "ETL is dead" has become popular, but the statement is false. It should rather be "Batch ETL is growing unpopular". Companies now believe not only in the power of data but also in the power of its freshness. This means that a dashboard showing yesterday's sales patterns is less useful than one showing sales patterns from the last 30 minutes.

Kafka, a scalable, distributed streaming and messaging platform, is a great choice for building today's ETL pipelines.

In this big data Kafka project, we will see this in theory as well as in implementation. We will see how data ingestion and loading are done with the Kafka Connect API, while transformation is done with the Kafka Streams API. But this is not all.
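To make the ingest, transform, and load split concrete, here is a minimal sketch in Python. It is not the project's own code. It assumes the file-based example connectors that ship with Apache Kafka (`FileStreamSourceConnector` and `FileStreamSinkConnector`); the connector names, file paths, topic names, and the `order_id,amount` record layout are all hypothetical. The `transform` function only illustrates the kind of per-record logic that would sit in a Kafka Streams application (normally written in Java or Scala) between the two connectors.

```python
import json


def source_connector(name, file_path, topic):
    """Build a Kafka Connect payload that ingests lines of a file into a topic."""
    return {
        "name": name,
        "config": {
            "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
            "tasks.max": "1",
            "file": file_path,   # hypothetical input file
            "topic": topic,      # topic the source writes to
        },
    }


def sink_connector(name, file_path, topic):
    """Build a Kafka Connect payload that loads a topic's records into a file."""
    return {
        "name": name,
        "config": {
            "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
            "tasks.max": "1",
            "file": file_path,   # hypothetical output file
            "topics": topic,     # sink connectors read from one or more topics
        },
    }


def transform(line):
    """Stand-in for the Streams step: parse 'order_id,amount' into a JSON record."""
    order_id, amount = line.split(",")
    return json.dumps({"order_id": order_id.strip(), "amount": float(amount)})


if __name__ == "__main__":
    # Hypothetical names: raw lines land in 'sales-raw', cleaned records in 'sales-clean'.
    print(json.dumps(source_connector("sales-source", "/tmp/sales.csv", "sales-raw"), indent=2))
    print(json.dumps(sink_connector("sales-sink", "/tmp/sales.out", "sales-clean"), indent=2))
    print(transform("42, 19.5"))
```

In a running setup, each payload would typically be submitted to the Kafka Connect REST API with a `POST` to its `/connectors` endpoint; the sketch only builds the payloads and does not contact a cluster.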

Similar Projects

Hadoop Projects for Beginners - Learn data ingestion from a source using Apache Flume and Kafka to make real-time decisions on incoming data.

In this Spark project, we are going to bring processing to the speed layer of the lambda architecture, which opens up capabilities to monitor real-time application performance, measure real-time comfort with applications, and raise real-time alerts in case of security

The goal of this IoT project is to build an argument for generalized streaming architecture for reactive data ingestion based on a microservice architecture. 

Curriculum For This Mini Project

1-Dec-2017: 02h 43m
8-Dec-2017: 03h 29m