A recent report by Databricks Inc., the primary commercial steward of Spark, highlighted that the technology is growing rapidly. Apache Spark's growth is driven by the use of SQL, machine learning and streaming analytics. The survey drew 1,615 participants from over 900 organizations, including data scientists, data architects, data engineers and others. It highlighted the fastest-growing areas of the Apache Spark ecosystem: 39% of Spark users were leveraging the machine learning library MLlib, 57% were streaming users, 67% were Spark SQL users, and the top growth area was the DataFrame API, at 153%.
Apache Spark version 2.0.1 was released recently. It is a maintenance release containing 300 stability and bug fixes, and it is based on the branch-2.0 maintenance branch of Spark. All 2.0.0 users are strongly recommended to upgrade to this stable release.
(Source : https://spark.apache.org/releases/spark-release-2-0-1.html)
A recent survey from Databricks shows that Spark's momentum has increased sharply in the past year and that it is more popular than ever. The survey states that the number of users has increased threefold since 2015, totalling 225,000. This reflects the increased adoption of Apache Spark among businesses.
Apache Spark has gained prominence in the big data domain because of its parallel data processing capabilities. It allows you to rapidly develop big data applications for machine learning, stream processing and graph analytics. A key concern when handling big data is speed. A notable difference between Spark and Hadoop MapReduce is that Spark has an optimized directed acyclic graph (DAG) execution engine, which results in an efficient query plan for data transformations.
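To make the DAG idea more concrete, here is a minimal plain-Python sketch of lazy evaluation: transformations are only recorded as a plan, and nothing runs until an action asks for results. This is a toy under stated assumptions, not Spark's API or its actual engine. The class `LazyDataset`, its `plan()` method and the helper `traced_square` are hypothetical names invented for illustration, and a linear chain of operations is only the simplest possible case of a DAG.

```python
from typing import Any, Callable, Iterable, Iterator, List, Optional


class LazyDataset:
    """Toy lazily evaluated dataset (hypothetical; not a Spark class)."""

    def __init__(
        self,
        source: Optional[Iterable[Any]] = None,
        parent: Optional["LazyDataset"] = None,
        op: str = "source",
        fn: Optional[Callable[[Any], Any]] = None,
    ) -> None:
        self._source = source
        self._parent = parent
        self._op = op
        self._fn = fn

    def map(self, fn: Callable[[Any], Any]) -> "LazyDataset":
        # Record the transformation; do not run it yet.
        return LazyDataset(parent=self, op="map", fn=fn)

    def filter(self, fn: Callable[[Any], bool]) -> "LazyDataset":
        # Record the transformation; do not run it yet.
        return LazyDataset(parent=self, op="filter", fn=fn)

    def plan(self) -> List[str]:
        """Return the recorded operations, ordered from source to this node."""
        steps: List[str] = []
        node: Optional["LazyDataset"] = self
        while node is not None:
            steps.append(node._op)
            node = node._parent
        return list(reversed(steps))

    def _iterate(self) -> Iterator[Any]:
        if self._op == "source":
            return iter(self._source)
        upstream = self._parent._iterate()
        if self._op == "map":
            return (self._fn(x) for x in upstream)
        return (x for x in upstream if self._fn(x))

    def collect(self) -> List[Any]:
        # The "action": walk the plan and stream every element
        # through all recorded steps in a single pass.
        return list(self._iterate())


calls: List[int] = []


def traced_square(x: int) -> int:
    calls.append(x)  # remember when the function actually runs
    return x * x


ds = LazyDataset(source=range(1, 6)).filter(lambda x: x % 2 == 1).map(traced_square)
calls_before_action = list(calls)  # still empty: only the plan exists so far
result = ds.collect()
```

Because the whole chain is known before any work starts, the filter and the map can be applied to each element in one pass without materializing an intermediate list. A real engine has far more to consider, such as partitioning and data movement between machines.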
SlamData Inc., the company building the industry's first complete BI solution for complex modern data, announced the release of SlamData 4.0. "SlamData's mission has always been to completely solve the biggest problem facing enterprise BI: data chaos," said Jeff Carr, SlamData's CEO and cofounder. SlamData 4.0 provides new connectors for modern data sources, adding support for MarkLogic, MongoDB, Spark on Hadoop and Couchbase.
(Source : http://www.prnewswire.com/news-releases/slamdata-40-adds-analytic-support-for-apache-spark-couchbase-and-marklogic-300346314.html)
Spark used to play a very limited role in the big data architecture at Webtrends, a company that collects user activity data from websites and mobile devices. Now, however, Apache Spark plays a critical role in the updated version of its analytics platform and is at the heart of the Webtrends Infinity Analytics application. Webtrends set up a 160-node Apache Spark system to optimize online marketing campaigns in real time by analysing the activity data streaming into its Hadoop clusters.
(Source : http://searchdatamanagement.techtarget.com/feature/Spark-architecture-finds-place-at-center-of-big-data-environments)
Alluxio, founded by Haoyuan Li, allows distributed computing frameworks like Apache Hadoop and Spark to access big data through memory-centric storage by providing a unified namespace across all the distributed storage systems. It can be thought of as a sophisticated cache for big data workloads. Alluxio launched Enterprise and Community editions of the software to monetize its work by offering advanced features and support.
(Source : https://techcrunch.com/2016/10/26/alluxio-launches-its-memory-centric-storage-system-for-big-data-workloads/)