Recap of Apache Spark News for July

Your monthly Apache Spark news fix for July 2016. Here's a look back at the month's top news in the Apache Spark community.


Last Updated: 12 Oct 2023 | BY ProjectPro

News on Apache Spark - July 2016

Apache Spark News for 2016

MongoDB “connects” with Apache Spark with its new connector. July 4, 2016. siliconANGLE.com

MongoDB Inc. has announced a new connector for Apache Spark that will allow Spark developers and data scientists to use its database to work with rapidly moving data. Kelly Stirman, MongoDB’s VP of Strategy, has said that MongoDB users have expressed great interest in working with the Spark connector. MongoDB has taken its Apache Hadoop connector and enhanced it for Spark.

(Source: http://siliconangle.com/blog/2016/07/04/mongodb-cozies-up-to-apache-spark-with-new-connector/)

Sparkling Water 2.0 enables machine learning with Apache Spark. July 4, 2016. siliconANGLE.com

Sparkling Water 2.0, a new tool from the startup H2O.ai (formerly 0xdata Inc.), provides an open source platform for algorithm development. It makes machine learning algorithms easier to use during data analysis. Instead of relying on Apache Spark’s machine learning library MLlib, users can tap into H2O’s AI platform through the Sparkling Water 2.0 application programming interface. The tool lets users combine Spark’s features with H2O’s own columnar compression, fully featured machine learning algorithms and speed.

(Source: http://siliconangle.com/blog/2016/07/04/sparkling-water-2-0-enables-machine-learning-with-apache-spark/)

Not all Apache Spark support is the same. Choose wisely. July 7, 2016. InfoWorld.com

Apache Spark has become extremely popular since its launch in 2012, and over the last year it has gained momentum in enterprise adoption. But not all Spark support is the same. Customers should look at four main facets before using Spark libraries: how Spark is used in the platform, what is available in the Apache Spark package, how everyone on the team is exposed to Spark, and how to perform analytics with the various libraries in Spark.

(Source: http://www.infoworld.com/article/3091105/analytics/take-a-closer-look-at-your-spark-implementation.html)

Splice Machine, which uses Hadoop and Spark, takes its new RDBMS Sandbox live on Amazon Web Services (AWS). June 18, 2016. StockTranscript.com

Splice Machine, an open source RDBMS powered by Hadoop and Spark, has announced its new open source Sandbox for developers. The new open source Sandbox 2.0 community edition is now available for testing on AWS.

(Source: http://www.stocktranscript.com/splice-machines-new-open-source-rdbms-sandbox-goes-live-on-amazon-web-services-aws/96726/)

TIBCO’s Apache Spark accelerator is out and about to make the fast Spark faster. July 26, 2016. RTInsights.com

Hayden Schultz, global architect at TIBCO, talks about how to bring technology that is making waves in the industry closer to the customer's understanding and usage. TIBCO focuses on building applications that boost a technology’s core feature. In the case of Apache Spark, the application will help build accelerated systems on top of Apache Spark to stimulate big data solutions.

(Source: https://www.rtinsights.com/first-look-video-tibcos-apache-spark-accelerator/)

Databricks unveils commercial support for Apache Spark 2.0. July 28, 2016. CIO.com

Apache Spark 2.0 is now available to users on the Databricks platform. Spark 2.0 delivers 5 to 10 times better performance than Spark 1.6 and supports applications requiring structured streaming. Tungsten Phase 2 whole-stage code generation and Catalyst's code optimization contribute to the improved speed of Spark 2.0. The latest release comes bundled with many new features, such as machine learning model persistence, DataFrame-based machine learning APIs and standard SQL support.

(Source: http://www.cio.com/article/3101842/analytics/databricks-unveils-commercial-support-for-apache-spark-2-0.html)

Apache MLlib: making practical machine learning easy and scalable. July 29, 2016. Jaxenter

Machine learning might seem futuristic, but it is not. Apache Spark’s scalable machine learning library MLlib is making machine learning easy for machine learning engineers and data scientists. MLlib not only fits models but can also be used for various stages and transformations such as data collection, data labelling, feature extraction, model tuning, model evaluation and deployment. Together with other Apache Spark components, MLlib provides data scientists with a unified solution under a single big data framework.

(Source: https://jaxenter.com/apache-mllib-making-practical-machine-learning-easy-and-scalable-128037.html)
