
Explore features of Spark SQL in practice on Spark 2.0

The goal of this Spark project for students is to explore the features of Spark SQL in practice on the latest version of Spark, i.e. Spark 2.0.

What will you learn

  • What is Spark SQL
  • Why you should consider Spark SQL before Spark Core
  • When you will have to use Spark Core
  • Spark SQL and multiple file types: Text File, JSON File, RDBMS Sources, NoSQL Sources
  • Spark SQL as a SQL-on-Hadoop server
  • Introduction to Spark Structured Streaming

What will you get

  • Access to recording of the complete project
  • Access to all material related to the project, such as data files and solution files.

Prerequisites

  • Students are expected to have a fair knowledge of Big Data and Hadoop, particularly HDFS, Spark and Hive.
  • An installation of the Cloudera QuickStart VM.
  • Since we will be doing the development in the QuickStart VM, it is essential to have the Scala SDK installed there as well. Instructions on how to set up a Scala SDK and runtime can be found here.
  • In the class, we will install Spark 2 in the Cloudera QuickStart VM. By default, the VM comes pre-installed with Spark 1.6.x.

Project Description

Spark 2 offers a huge yet backward-compatible break from Spark 1.x, not only in terms of the high-level API but also in performance. The Spark module with the most significant new features is Spark SQL.

In this Apache Spark project, we will explore a number of these features in practice.

Using various datasets, we will discuss the new unified Spark API as well as the optimization features that make Spark SQL the first option to explore when processing structured data.

However, there are times when it is inevitable to resort to Spark Core (RDDs) in Spark 2. We will explore that as well, alongside the new Structured Streaming API, a fault-tolerant stream processing engine built on the Spark SQL engine.



Big Data & Enterprise Software Engineer

I am passionate about software development, databases, data analysis and the Android platform. My native language is Java, but no one has stopped me so far from learning and using Angular and Node.js. Data and data analysis are thrilling, and so are my experiences with SQL on Oracle, Microsoft SQL Server, Postgres and MyS...

Curriculum For This Mini Project