
Integrating Spark and NoSQL Database for Data Analysis

In this project, we will look at two database platforms, MongoDB and Cassandra, examine the philosophical differences in how these databases work, and perform analytical queries on them.
What are the prerequisites for this project?
  • It is expected that students have a fair knowledge of Big Data and Hadoop, particularly Spark.
  • Installation of a Hadoop quickstart VM.
  • Installation of MongoDB and Cassandra in your VM or host machine.

What will you learn?

  • Introduction to NoSQL Document store - MongoDB
  • Introduction to NoSQL Wide column store - Cassandra
  • Use cases for storing Spark data in NoSQL databases
  • Spark I/O connectors
  • Querying our NoSQL databases

Project Description

One of Spark's benefits is that it extends to many storage platforms beyond Hadoop. This means that, as Spark developers, we can read from and write to virtually any popular storage platform while building our data pipeline.
In this Hackerday, we will look at two such database platforms, MongoDB and Cassandra. They belong to two different classes of database (a document store and a wide column store, respectively), each suited to different use cases. We will discuss these use cases, install both platforms in our lab environment, look at the philosophical differences in how the databases work, create sample tables, and finally integrate our Spark application to load the UK MOT vehicle testing dataset into them. Once the data is loaded, anyone can run analytical queries on the tables at any time.
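As a rough illustration of what writing from Spark to either database can look like, here is a minimal Python sketch. It is not the project's actual code. The helper names (mongo_target, cassandra_target, write_to) are hypothetical, and the format names and option keys are assumptions based on the MongoDB Spark Connector 10.x and the Spark Cassandra Connector; older connector versions use different names, so check the documentation for the versions you install.

```python
# Sketch only: the format names and option keys below are assumptions tied to
# specific connector versions (MongoDB Spark Connector 10.x and the
# Spark Cassandra Connector). The helper names are hypothetical.


def mongo_target(uri, database, collection):
    """Describe a MongoDB sink for a Spark DataFrame write."""
    return {
        "format": "mongodb",
        "options": {
            "connection.uri": uri,
            "database": database,
            "collection": collection,
        },
    }


def cassandra_target(keyspace, table):
    """Describe a Cassandra sink for a Spark DataFrame write."""
    return {
        "format": "org.apache.spark.sql.cassandra",
        "options": {"keyspace": keyspace, "table": table},
    }


def write_to(df, target, mode="append"):
    """Write a Spark DataFrame to the described target.

    Uses only the standard DataFrameWriter chain:
    format -> options -> mode -> save.
    """
    (df.write
       .format(target["format"])
       .options(**target["options"])
       .mode(mode)
       .save())
```

Given a DataFrame df holding the MOT data, a call such as write_to(df, cassandra_target("mot", "test_results")) would append its rows to an existing Cassandra table; the keyspace and table names here are placeholders. The same DataFrame could go to MongoDB by swapping in mongo_target(...). Both connectors must be on the Spark classpath, and the Cassandra host is normally supplied through the Spark configuration rather than per write.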



Big Data & Enterprise Software Engineer

I am passionate about software development, databases, data analysis and the Android platform. My native language is Java, but no one has stopped me so far from learning and using Angular and Node.js. Data and data analysis are thrilling, and so are my experiences with SQL on Oracle, Microsoft SQL Server, Postgres and MyS...

What is Hackerday?

Stay up to date with technology trends by working on projects

Live online coding sessions led by industry experts

Build 2-4 projects a month, each lasting 6 hours and designed to teach you advanced concepts

Code in groups and connect with your community