Recap of Hadoop News for August

News on Hadoop - August 2016

Hadoop News

Latest Amazon Elastic MapReduce release supports 16 Hadoop projects. TechCrunch, August 2, 2016

Amazon released Elastic MapReduce (EMR) 5.0, the latest version of its service, aimed at helping data scientists and other interested parties manage big data projects with Hadoop. The EMR release includes support for 16 open source Hadoop projects.

Hadoop vendor MapR raises another $50M as it sets its sights on IPO. TechCrunch, August 9, 2016

As the adoption of Hadoop continues to grow among enterprises, MapR has announced $50 million in funding to get ready for an IPO and secure its place as a leading Hadoop vendor. The Hadoop market is a tough three-horse race between MapR, Cloudera and Hortonworks. Cloudera has raised $1 billion to date, and Hortonworks made a strong public listing debut in 2014.

MapR and Hortonworks announce Hadoop’s position in commercial market. August 10, 2016.

MapR and Hortonworks, two of the biggest Hadoop distributions, announced their deals and gave a fair idea of how Hadoop stands in the current commercial market. MapR has just announced its fifth round of funding, of at least $50 million, and Hortonworks, despite disappointing Q2 2016 results, announced its second product line.


Hadoop accelerates with Apache Ignite. August 16, 2016.

Hadoop runs faster with in-memory caching. To speed up data processing across the board, you need to speed up HDFS file access. Hortonworks DataFlow is an integrated platform that makes data ingestion and processing easier and faster in Hadoop.
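To illustrate why an in-memory cache helps with repeated file access, here is a minimal sketch of a read-through cache. It is not Apache Ignite's API or an HDFS client: `CachedReader`, `slow_read`, and the file path are hypothetical names, and the slow storage layer is simulated by a plain function rather than a real distributed read.

```python
from typing import Callable, Dict


class CachedReader:
    """Read-through cache: repeated reads are served from memory."""

    def __init__(self, backend_read: Callable[[str], bytes]) -> None:
        self._backend_read = backend_read
        self._cache: Dict[str, bytes] = {}
        self.backend_calls = 0  # how many times the slow backend was hit

    def read(self, path: str) -> bytes:
        # Only go to the backend on a cache miss.
        if path not in self._cache:
            self.backend_calls += 1
            self._cache[path] = self._backend_read(path)
        return self._cache[path]


def slow_read(path: str) -> bytes:
    # Hypothetical stand-in for a slow distributed file read.
    return f"contents of {path}".encode()


reader = CachedReader(slow_read)
first = reader.read("/data/events.log")   # miss: goes to the backend
second = reader.read("/data/events.log")  # hit: served from memory
print(reader.backend_calls)
```

The second read never touches the backend, which is the effect an in-memory layer in front of file storage is meant to achieve. A real deployment would also have to handle eviction and invalidation, which this sketch leaves out.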


Processing on Hadoop Clusters has to be optimized to increase the performance of Hadoop Clusters. August 19, 2016.

The challenge of using Hadoop's distributed computing system is that everyone wants the Hadoop cluster exclusively for their own work. But the whole reason Hadoop was chosen over mainframes is its processing speed and storage capacity. Pepperdata provides one solution for health checks of Hadoop clusters, so that everyone can get processing time on the cluster for their own work.


Analysts remain bullish on Hadoop even as Spark threatens to steal the show. SiliconAngle, August 26, 2016

Apache Spark might be the hottest technology in the big data world today, but Research and Markets analysts predict that Hadoop will remain as hot as ever. The global Hadoop market is anticipated to grow at a CAGR of 63.4% over the next 5 years, reaching $84.6 billion by 2021. Hadoop is likely to see strong enterprise demand in Europe, with annual growth rates reaching 65%. “Factors such as aggrandized generation of structured and unstructured data and efficient and affordable data processing services offered by Hadoop technology are the major drivers of the market,” the researchers said in a statement.
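The forecast figures can be checked with simple compound-growth arithmetic. The sketch below works backwards from the quoted $84.6 billion and 63.4% CAGR to the starting market size they imply. It assumes five full annual compounding periods ending in 2021, which the article does not state explicitly, and the function names are illustrative only.

```python
def future_value(base: float, rate: float, years: int) -> float:
    """Compound a starting value at a constant annual growth rate."""
    return base * (1.0 + rate) ** years


def implied_base(target: float, rate: float, years: int) -> float:
    """Invert the compound-growth formula to recover the starting value."""
    return target / (1.0 + rate) ** years


RATE = 0.634            # CAGR quoted in the forecast
YEARS = 5               # assumption: five annual compounding periods
TARGET_BILLION = 84.6   # forecast market size for 2021, in billions of USD

base_billion = implied_base(TARGET_BILLION, RATE, YEARS)
print(f"Implied starting market size: ${base_billion:.2f} billion")
```

Under that assumption, the forecast implies a starting market of roughly $7.3 billion, which grows about 11.6-fold over the five years.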

SAP is buying Hadoop-as-a-service startup Altiscale for over $125M. SiliconAngle, August 26, 2016

SAP SE is set to take the final step in acquiring Palo Alto-based startup Altiscale Inc. for $125 million. Altiscale provides managed data processing services. SAP plans to integrate Altiscale's services into its cloud-based analytics portfolio, which is built on its in-memory HANA database. Altiscale has a premium edition, known as Altiscale Insight Cloud, that layers BI functionality on top of the Hadoop system. SAP plans to use this premium edition to make a hosted Hadoop implementation available to its customers, bridging functionality gaps in its line-up and addressing a new set of use cases that it could not support earlier.

Teradata ports Aster analytics to Hadoop. August 30, 2016

Previously, users had to purchase the Aster database to use Aster's analytic capabilities, but Teradata has now decoupled Aster Analytics from its underlying database. The latest version of Aster Analytics will offer Aster's unique analytics functions as software-only for Hadoop and the AWS cloud.


