Recap of Apache Spark News for December

News on Apache Spark - December 2016

Databricks Expands Platform for Apache Spark Deployments in the Cloud, December 12, 2016

Databricks, the team that created Apache Spark, has announced new capabilities that simplify deploying Spark in the cloud. The enhancement complements Databricks' existing data science environment, which lets users analyze data in real time with data science notebooks that can then be deployed as Apache Spark jobs in production. Ali Ghodsi, CEO and co-founder of Databricks, said there is an unforeseen demand for a robust and secure Apache Spark platform in the cloud to run production workloads.


With Apache Spark, Old Mainframes Learn New, December 26, 2016

Apache Spark, an open source framework known for its processing speed and for ETL-related jobs, which constitute around 55 percent of its reported use today, has gained a new dimension thanks to the IBM z/OS platform. The platform enables Apache Spark to run natively on mainframe systems, so users can analyze data on the system itself. This cuts the time otherwise spent moving data to Hadoop for ETL jobs, which in turn saves money. Spark can be applied on mainframes in various other ways too, such as fraud pattern detection, real-time payment status, and targeted marketing. The main advantage of Spark on the mainframe is not only its speed but also the ability to access and analyze the data natively on the mainframe itself. (Source:

Databricks Raises $60M to Fuel Apache Spark, December 21, 2016

In a very short span of time, Spark has become the most active project in big data, with over 1K contributors from 250 different organizations. With these credentials to Spark's name, Databricks, founded by the creators of Apache Spark, has closed $60 million in funding from NEA, taking it to the $100 million mark. Powered by Apache Spark, Databricks' own just-in-time data analytics platform in the cloud now serves around 400 clients, providing data integration, real-time data analysis, and more. (Source:

Relevant Projects

Yelp Data Processing using Spark and Hive Part 2
In this spark project, we will continue building the data warehouse from the previous project Yelp Data Processing Using Spark And Hive Part 1 and will do further data processing to develop diverse data products.

Web Server Log Processing using Hadoop
In this Hadoop project, you will use a sample application log file from an application server to demonstrate a scaled-down server log processing pipeline.

Finding Unique URL's using Hadoop Hive
Hive Project - Learn to write a Hive program to find the first unique URL, given 'n' URLs.

Airline Dataset Analysis using Hadoop, Hive, Pig and Impala
Hadoop Project - Perform basic big data analysis on an airline dataset using the big data tools Pig, Hive and Impala.

Data Mining Project on Yelp Dataset using Hadoop Hive
Use the Hadoop ecosystem to glean valuable insights from the Yelp dataset. You will analyze the different patterns that can be found in the Yelp dataset to come up with various approaches to solving a business problem.

Hadoop Project-Analysis of Yelp Dataset using Hadoop Hive
The goal of this Hadoop project is to apply some data engineering principles to the Yelp dataset in the areas of processing, storage, and retrieval.

Hive Project - Visualising Website Clickstream Data with Apache Hadoop
Analyze clickstream data of a website using Hadoop Hive to increase sales by optimizing every aspect of the customer experience on the website from the first mouse click to the last.

Data processing with Spark SQL
In this Apache Spark SQL project, we will go through provisioning data for retrieval using Spark SQL.

Data Warehouse Design for E-commerce Environments
In this hive project, you will design a data warehouse for e-commerce environments.

Hadoop Project for Beginners-SQL Analytics with Hive
In this hadoop project, learn about the features in Hive that allow us to perform analytical queries over large datasets.