With Apache Spark booming and its community growing at a rapid pace, Spark is making waves in the big data ecosystem. Though Spark in the cloud is nothing new, Databricks is announcing its latest addition, Delta, a smart cache layer in the cloud that will offer scalability and elasticity. A smart cache layer like Delta brings an array of benefits for people working in the cloud, but only if they are willing to shell out big bucks. However, Databricks' major focus is on growing its proprietary platform by making streaming and deep learning work together in the cloud.
(Source : www.zdnet.com/article/the-future-of-the-future-spark-big-data-insights-streaming-and-deep-learning-in-the-cloud/)
Azure users interested in gleaning meaningful business insights by parsing huge amounts of data will soon be able to use Azure Databricks, which is built around Apache Spark, the popular open source big data framework, and developed in collaboration with Databricks. The first Spark-as-a-service of any of the cloud vendors, Azure Databricks will be used to model real-time data patterns. For instance, the platform could be used to measure how guests in a hotel move around the lobby so the hotel can decide on the best place for furniture and guest services.
(Source : https://www.geekwire.com/2017/microsoft-launches-azure-databricks-new-cloud-data-platform-based-apache-spark/ )
Shawn Dolley, global industry leader of health and life sciences at Cloudera, said that Spark "is becoming the lingua franca of research computing pipeline generation." Earlier, Cloudera was a support organization for most of the big data technologies, but now one third of the demand for Cloudera services comes from people working on computational pipelines who want them to run on Apache Spark. Cloudera (of which Intel holds a stake of 18%) is among the leading providers of support for Apache Spark when it comes to clinical data.
(Source : https://www.genomeweb.com/informatics/cloudera-bets-its-future-scalability-spark-gatk-support )
Qubole is making Apache Spark easier and more flexible to use by giving its customers the ability to run Apache Spark applications on the AWS Lambda service. Because Lambda is a serverless compute service, customers who execute Spark apps on it pay only for the compute power they use without having to manage servers, which makes the platform elastic and efficient in terms of resource usage. This overcomes two major problems that previously made running Spark applications on Lambda a challenging task. The first is Spark’s inability to communicate directly with the AWS Lambda service, and the other is AWS Lambda’s limited runtime resources: a maximum runtime duration of 5 minutes, 512 MB of disk space and 1536 MB of memory.
(Source : https://siliconangle.com/blog/2017/11/22/big-data-company-qubole-brings-apache-spark-aws-lambda/ )
Microsoft has incorporated various third-party platforms on its Azure Cloud to help data analysts and developers. Its latest Azure capabilities include a beta Spark cluster computing platform named Azure Databricks, which is meant to help them glean insights from enterprise data. Developers can sign up for the beta version of Azure Databricks.
(Source : http://www.techcentral.ie/azure-cloud-gets-apache-spark-cassandra-mariadb/)