News on Apache Spark - December 2017
How CardinalCommerce grew its big data analytics capabilities. TechTarget.com, December 1, 2017.
CardinalCommerce Corp., a Mentor, Ohio-based company purchased by Visa in 2017, generates huge amounts of financial transaction data, and gleaning valuable insights from that data has always been a top priority and a central challenge. The company first built a small Spark cluster on premises for basic data processing tasks, such as pulling data back from CardinalCommerce’s payment platform for reporting purposes. Later, it moved its Spark workloads to the Amazon EMR big data service in the cloud, which gave it extra flexibility to scale Apache Spark for larger workloads as required. The lesson CardinalCommerce learnt from doing big data analytics with Spark is that it sparked a demand for data within the organization and increased the number of in-house users engaged with the analytics process.
(Source : http://searchbusinessanalytics.techtarget.com/feature/How-CardinalCommerce-grew-its-big-data-analytics-capabilities )
AMD scores EPYC gig powering new Azure instances. TheRegister.co.uk, December 5, 2017.
AMD has earned a place in the top tier of cloud computing by winning Microsoft's business for the next generation of Azure L-series instances. AMD’s 32-core, 2.2 GHz EPYC 7551 will power a new Lv2 Azure instance type that is optimised for storage as well as for workloads like Apache Spark. AMD regards the win as proof that it is a force in the server CPU market again.
(Source : https://www.theregister.co.uk/2017/12/05/amd_epyc_to_power_new_azure_lv2_instances/ )
A Decade into Big Data. Datanami.com, December 11, 2017.
The first data-oriented approach arrived in 2008 with Hadoop, which grew out of a Google research paper and was started in 2006, and big data has evolved rapidly ever since. Hadoop turned 10 in 2016. Another important contributor to the big data world has been Apache Spark, which addressed the performance limitations and increased costs associated with Hadoop's disk-based storage approach. With the advent of Spark, there was a huge shift from batch processing to real-time and event processing. Big data has moved through several stages, from the Hadoop era to Spark and then to data lakes and data fabrics, and data enthusiasts are wondering what comes next. As the use of sensors and related technologies evolves, data will be streamed into big data clusters for real-time analysis. As an increasing number of companies adopt data science and machine learning, Apache Spark machine learning and Google TensorFlow will be the blockbuster tools for predictive analytics and deep learning.
(Source : https://www.datanami.com/2017/12/11/decade-big-data/ )
Accelerite takes single-pipeline approach to data transformation and analysis. SiliconAngle.com, December 12, 2017.
Cloud management vendor Accelerite released ShareInsights 2.0, an end-to-end self-service analytics platform for data preparation, data visualization, collaboration and online analytical processing, all from a single user interface. Accelerite’s platform prepares and queries terabytes of data in minutes. ShareInsights 2.0 runs on top of a Hadoop cluster and leverages existing Apache Spark instances for predictive analytics and machine learning tasks. The platform has 50+ connectors for other data sources and 100+ analytical widgets for tasks ranging from simple aggregation to complex machine learning.
(Source : https://siliconangle.com/blog/2017/12/12/accelerite-takes-single-pipeline-approach-data-transformation-analysis/ )
Big data delivers higher revenue and faster growth. Betanews.com, December 19, 2017.
60% of the organizations that adopt big data cite improved efficiency and increased productivity among their biggest gains from using it. In 2014, Hadoop and Spark attracted high interest but low adoption; now 70% of enterprises are either using these big data technologies in production or plan to use them in the future. 90% of organizations say that moving away from legacy systems and investing in big data technologies like Hadoop and Spark has not only proved valuable in deriving meaningful insights but has also helped them save money.
(Source : https://betanews.com/2017/12/19/big-data-revenue-growth/ )
Microsoft's cloud Big Data service cuts prices up to 52 percent. Zdnet.com, December 18, 2017.
Microsoft announced a price cut for HDInsight (HDI), its Azure cloud-hosted big data offering based on Hadoop, Spark, Storm, Kafka, Hive and Microsoft R Server. Aiming to make its prices far more competitive with Amazon AWS and EMR, it reduced the charges for R Server by 80% and for HDInsight by up to 52%. The service offering itself remains the same, with a 3-nines (99.9%) service level agreement that Microsoft presents as a differentiator from its competitors.
(Source : http://www.zdnet.com/article/microsofts-cloud-big-data-service-cuts-prices-up-to-52/ )
53% of Companies are adopting Big Data Analytics. Forbes.com, December 24, 2017.
Big data adoption reached 53% in 2017, up from 17% in 2015. The leading early adopters of big data include the telecom and financial sectors, with data warehouse optimization being the top use case. The software that has gained the most popularity for big data includes Apache Spark, Hadoop MapReduce and YARN. 30% of the organizations surveyed consider Apache Spark a critical component of their big data strategies, and 20% consider Hadoop MapReduce and YARN critical components. The big data access methods most preferred by organizations include Spark SQL, Hadoop HDFS, Hadoop Hive and Amazon S3, with 73% of the organizations considering Spark SQL the key component for implementing their analytics strategies.
(Source : https://www.forbes.com/sites/louiscolumbus/2017/12/24/53-of-companies-are-adopting-big-data-analytics/#449c16a639a1 )
Artificial Intelligence Needs Big Data, and Big Data Needs AI. RTInsights.com, December 26, 2017.
Big data and artificial intelligence have formed a symbiotic relationship, and each needs the other to deliver on what it promises. Mike Matchett, a senior analyst with Taneja Group who has been observing the revolution in the AI market, said that Apache Spark is the spark for AI development using big data. Artificial intelligence is resource intensive, and many organizations do not have the infrastructure for it. Under such circumstances, open source tools like Apache Spark make the proposition cost effective and compelling. Apache Spark has gained widespread adoption for its in-memory, real-time processing and fast machine learning at scale.
(Source : https://www.rtinsights.com/artificial-intelligence-needs-big-data-and-big-data-needs-ai/ )