Recap of Apache Spark News for February

News on Apache Spark - February 2016

IBM has launched a host of cloud services to support Apache Spark. February 4, 2016.

IBM has signalled its backing for Spark by launching a host of cloud services that bolster Apache Spark with NoSQL, graph and machine learning capabilities. IBM has surrounded Spark with new engines that can feed it more data directly for analysis and generate insights from Apache Spark in real time.

Atigeo’s CTO and Principal Lead Engineer will present at the Apache Spark Summit in New York from February 16-18. February 4, 2016.

At the largest big data event dedicated specifically to Apache Spark, to be held at the New York Hilton Midtown from Feb 16-18, David Talby, the CTO at Atigeo, will speak on fraud detection with big data technologies like Apache Spark. Atigeo is a compassionate technology company that works on building solutions for the most pressing problems of the moment.

Apache Spark has become the most active Big Data open source project in 2016. February 8, 2016. 

Apache Spark is all set to drive innovation in the big data world. With fast adoption across enterprises, it is becoming a favorite among companies building Big Data solutions. A recent survey by the big data company Syncsort showed that companies pursuing big data solutions are more interested in adopting Spark than MapReduce. Nearly 70% of the respondents chose Spark as a preferred technology.

Apache Spark lights the way for extensive Big Data Adoption. February 10, 2016. 

Ten years ago, when Hadoop came into the picture, companies started adopting Big Data to solve business problems. But due to the complexity of working with MapReduce in Hadoop, smaller companies shied away from Big Data, as they could not afford to hire specialized talent to work with MapReduce. With the arrival of Apache Spark, Big Data adoption has spread to a wider audience, as Spark removes the need to work with MapReduce directly and is reported to be up to 100 times faster in data analysis.

Apache Spark is becoming a favorite among UK firms for big data projects. February 16, 2016. 

In a recent study conducted by Computing, to which more than 500 CTOs and CIOs responded, Spark was declared a popular choice for big data solutions. While most firms are still using Hadoop, Apache Spark is catching up, with an adoption increase of 32%.

Apache Spark rides on Hadoop into Enterprise adoption. February 17, 2016. 

During the Spark Summit in New York, Forrester Inc.’s analyst Mike Gualtieri said that Spark tends to come into the enterprise piggybacking on its big brother, Hadoop. Mr. Gualtieri views the two technologies as complementary, as many Hadoop vendors have included Spark in their distributions. ‘Hadoop was built for volume and Spark is built for speed,’ Mr. Gualtieri says.

Apache Spark spreads its wings with SAP adoption. February 17, 2016. 

With buy-in in mobile analytics, SAP has announced that it will rely heavily on Apache Spark to support its SAP Predictive Analytics 2.5 software. The software provides performance-boosting native Spark modelling techniques for developers and data scientists working in Hadoop-based environments.

Apache Spark is all set for a software update as the new 2.0 version rolls out. February 25, 2016.

At the Spark Summit East in New York, the creator of Apache Spark, Matei Zaharia, said that a new version of Apache Spark will roll out in either April or May this year. Attendees were assured that there will be no major changes to the APIs. The major overhaul is directed at data streaming with Spark Streaming.
