Recap of Hadoop News for May 2017

Hadoop Monthly News Updates: Learn what happened in the world of big data and Hadoop in May 2017!

By ProjectPro

News on Hadoop - May 2017

Apache Hadoop News for May 2017

High-end backup kid Datos IO embraces relational, Hadoop data. TheRegister.co.uk, May 3, 2017.

Datos IO has extended its on-premise and public cloud data protection to RDBMS and Hadoop distributions. The latest version of its RecoverX distributed database backup product, v2.0, now supports Hadoop. RecoverX is described as app-centric: it backs up application data and can recover it at various levels of granularity, which improves storage efficiency. By supporting RDBMS and Hadoop, Datos IO has become a data protection and migration platform for its customers.

(Source: https://www.theregister.co.uk/2017/05/03/relational_and_big_data_backup_refresh/ )


Cloudera IPO Highlights The Big Data And Hadoop Opportunity. Forrester.com, May 4, 2017.

Cloudera completed its IPO, raising $259 million in equity capital with shares priced at $15, which then traded up to more than $18 per share. Cloudera’s customer base is primarily Global 8000 companies, which account for 73% of its revenue. According to Forrester, Hadoop is here to stay but is likely to face many challenges as an increasing number of organizations wish to use Hadoop in the cloud. Cloudera is addressing this demand by partnering with CenturyLink, Google, AWS, and Microsoft to bring Hadoop to the cloud. Forrester predicts that Hadoop deployments are likely to be the fastest-growing piece of the Hadoop ecosystem in the future.

(Source: http://blogs.forrester.com/jennifer_adams/17-05-04-cloudera_ipo_highlights_the_big_data_and_hadoop_opportunity?cm_mmc=RSS-_-BT-_-71-_-blog_10164 )

Chameleon Speeds Development of Portable Hadoop Reader for Parallel File Systems. HPCwire.com, May 4, 2017.

Moving data between HDFS and a parallel file system (PFS) is difficult, so scientists who want to make the best use of analytics on Hadoop have to copy the data from parallel file systems first. This can slow workflows to a crawl, particularly those involving terabytes of data. Scientists in Xian-He Sun’s group are resolving this issue with a cross-platform Hadoop reader known as PortHadoop, which moves data directly from the parallel file system into Hadoop’s memory instead of copying it from disk to disk. PortHadoop uses the concept of virtual blocks, which bridge the two systems by mapping data from the parallel file system directly into Hadoop memory, creating a virtual HDFS environment.

(Source: https://www.hpcwire.com/off-the-wire/chameleon-speeds-development-portable-hadoop-reader-parallel-file-systems/ )
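The article does not describe how PortHadoop builds its virtual blocks, so the following is only a hypothetical illustration of the general idea: describing a file on a parallel file system as a sequence of block-sized byte ranges that a reader could fetch straight into memory, without a disk-to-disk copy. The names `VirtualBlock` and `plan_virtual_blocks`, and the 128 MB block size in the usage note, are assumptions made for this sketch and are not taken from PortHadoop.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VirtualBlock:
    """A byte range of a PFS file, standing in for an HDFS-style block (hypothetical)."""
    index: int   # position of the block within the file
    offset: int  # starting byte in the PFS file
    length: int  # number of bytes covered by this block


def plan_virtual_blocks(file_size: int, block_size: int) -> List[VirtualBlock]:
    """Split a file of `file_size` bytes into consecutive block-sized ranges."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    blocks: List[VirtualBlock] = []
    offset = 0
    while offset < file_size:
        # The final block may be shorter than block_size.
        length = min(block_size, file_size - offset)
        blocks.append(VirtualBlock(len(blocks), offset, length))
        offset += length
    return blocks
```

Under these assumptions, `plan_virtual_blocks(10 * 1024**3, 128 * 1024**2)` would describe a 10 GB file as 80 ranges, each of which a task could read directly from the parallel file system.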


That giant sucking sound? Hadoop moving into the cloud. EnterpriseIrregulars.com, May 5, 2017.

Hadoop was never meant for on-premise enterprise deployment; however, many early deployments were on premise, owing to several management overheads. Though Hadoop in the cloud promises elastic scalability, capacity planning remains an issue. Hadoop distribution providers can mitigate these issues to a certain extent, but a better alternative is to use managed services from Azure, AWS, and Google Cloud Platform. Google Cloud Platform (GCP) has had several wins for its big data services, a major one being Spotify, which stated that it is willing to trade openness for convenience and extra capability. Other organizations that have opted for Google Cloud Platform include HSBC, Qubit, and Ocado. The benefits of moving to the cloud are obvious: data governance issues are easily dealt with, and the manageability advantages of getting data into the cloud are too significant to be ignored. Moreover, big data and Hadoop workloads make more sense off premise.

(Source: https://www.enterpriseirregulars.com/115048/giant-sucking-sound-hadoop-moving-cloud/ )

Has the Hadoop market turned a corner? Zdnet.com, May 8, 2017.

Cloudera’s long-awaited IPO and Hortonworks’ Q1 results show positive signs on the path to profitability, with their product revenue growth surpassing professional services. Cloudera has been growing at a steady rate of 50% over the last three years, whereas Hortonworks settled at this rate only in the last year. Hortonworks reached the $100 million mark a lot quicker. Cloudera is more inclined toward becoming a product-centric business, with 23% of its revenue coming from services in the past year compared with 31% for Hortonworks. Hadoop is a land-and-expand business for vendors like Cloudera and Hortonworks. With 90% of existing customers renewing and expanding their subscriptions, the news looks good for both. For these Hadoop vendors, the big data market is all about big and fast data, which includes cloud-based services for Hadoop and other offerings for running Spark, big data pipelines, machine learning, and streaming. All these managed services are a boon for Hadoop vendors seeking to fulfill their promises in a broader ecosystem.

(Source : http://www.zdnet.com/article/has-the-hadoop-market-turned-a-corner/ )

Medical big data to be pooled for disease research and drug development in Japan. JapanTimes.co.jp, May 15, 2017.

Medical records of people admitted to hospitals and clinics have remained untapped because of the difficulty of handling large volumes of sensitive data and other privacy concerns. However, the government now thinks that if the data is made anonymous, it can be put to better use for medical innovation and drug development. To support this, Japan has passed a new law, referred to as the medical infrastructure law, that will allow medical big data to be used for research and the development of novel drugs.

(Source : http://www.japantimes.co.jp/news/2017/05/15/reference/medical-big-data-pooled-disease-research-drug-development-japan/#.WRqmHOWGO00 )

Zensar Technologies launches big data enabled platform. EconomicTimes.IndiaTimes.com, May 16, 2017.

Zensar Technologies announced the launch of its new big data platform for effortless end-to-end information management, which aims to address complex business problems through data-driven insights. The new platform fosters a culture of data-driven decision making by integrating an organization's existing assets with various unstructured data sets. It comes with a ready-to-implement business apps suite of 30 business applications focused on the BFSI, manufacturing, and retail industries, which help enterprises glean meaningful insights in terms of business-specific KPIs.

(Source : http://cio.economictimes.indiatimes.com/news/big-data/zensar-technologies-launches-big-data-enabled-platform/58697059 )

Committers Talk Hadoop 3 at Apache Big Data. Datanami.com, May 18, 2017.

The release of Apache Hadoop 3 this year will change how customers store and process data on clusters. At the recent Apache Big Data show in Florida, a couple of Hadoop project committers from Cloudera shared how Hadoop 3 will affect HDFS and YARN. The major change coming to HDFS in Hadoop 3 is the addition of erasure coding for better storage efficiency. Hadoop 3 will also bring notable enhancements to YARN in the form of Docker containers, which will help reduce the dependencies that exist when customers deploy services on Hadoop.

(Source: https://www.datanami.com/2017/05/18/committers-talk-hadoop-3-apache-big-data/ )
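To make the storage efficiency claim for erasure coding concrete, the sketch below compares the raw disk space needed to store the same logical data under plain replication and under a Reed-Solomon style layout. The 3x replication factor and the RS(6,3) layout (six data units plus three parity units) are assumptions chosen for illustration and are not figures from the article; the function names are hypothetical.

```python
def replication_raw_bytes(logical_bytes: int, replicas: int = 3) -> int:
    """Raw bytes stored when every block is kept `replicas` times."""
    return logical_bytes * replicas


def erasure_coded_raw_bytes(
    logical_bytes: int, data_units: int = 6, parity_units: int = 3
) -> float:
    """Raw bytes stored when each group of `data_units` data units
    is accompanied by `parity_units` parity units."""
    return logical_bytes * (data_units + parity_units) / data_units


def storage_overhead(raw_bytes: float, logical_bytes: int) -> float:
    """Extra storage as a fraction of the logical data size."""
    return raw_bytes / logical_bytes - 1


logical = 600  # arbitrary logical data size, e.g. in terabytes
replicated = replication_raw_bytes(logical)
erasure_coded = erasure_coded_raw_bytes(logical)

print(f"replication:    {replicated} raw, overhead {storage_overhead(replicated, logical):.0%}")
print(f"erasure coding: {erasure_coded:.0f} raw, overhead {storage_overhead(erasure_coded, logical):.0%}")
```

Under these assumptions, replication carries 200% overhead while the erasure-coded layout carries 50%, which is the kind of saving the committers refer to.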

The Elephant In The Room With Hadoop: It Offers Rich Technology With Slimmer-Than-Expected Margins. Forbes.com, May 19, 2017.

With the proliferation of big data, many technologies are being deployed to support hybrid infrastructures on public and private clouds. Hadoop is one such popular open source framework for managing large clusters of servers. Hadoop is not just a great technology but also a good business. According to a Zion Market Research report, the global Hadoop market is anticipated to grow to $87.14 billion in 2022 at a compound annual growth rate of 50%. A similar report by Allied Market Research forecasts that Hadoop will generate hardware, software, and services revenue of $84.6 billion in 2021.

(Source: https://www.forbes.com/sites/johnsonpierr/2017/05/19/the-elephant-in-the-room-with-hadoop-it-offers-rich-technology-with-slimmer-than-expected-margins/#64dfa202518a )

The siren song of Hadoop. ComputerWorld.com, May 23, 2017.

Apache Hadoop seems well suited to machine learning workloads, making it an ideal environment for executing compute-intensive machine learning algorithms on large datasets. However, organizations trying to get real data science work done with the Hadoop framework face multiple hurdles because of conflicting technologies, an expensive cost structure in the cloud, a disconnect between IT and users, and multiple approaches to achieving the same goal. The solution is to identify an abstraction layer between the raw Hadoop layer and data science users. Using this abstraction layer, data scientists can accomplish the work they would like to do while the platform identifies the right set of technologies, such as Hadoop MapReduce, Pig, or Spark, to accomplish those goals.

(Source : http://www.computerworld.com/article/3196509/data-analytics/the-siren-song-of-hadoop.html )

Cloudera Unveils Altus to Simplify Hadoop in the Cloud. Datanami.com, May 24, 2017.

Running Hadoop, whether in the cloud or on premise, requires professionals with specialized skills (data engineers, data analysts, or data scientists) who can configure, manage, and maintain Hadoop clusters for their clients. Cloudera is trying to eliminate this burden with its new cloud-based offering, Altus. Altus is all about making Hadoop easy for clients, who can now run the Hadoop suite of tools and applications on public cloud infrastructure. Using Altus, data engineers who work on transient clusters in the cloud can easily and quickly run their Hadoop jobs, spin up jobs, and terminate them without having to get into Hadoop cluster operations and management.

(Source : https://www.datanami.com/2017/05/24/cloudera-unveils-altus-simplify-hadoop-cloud/ )

2017 Salary Report for Big Data and Hadoop Skills. BusinessZone.co.uk, May 25, 2017.

As big data technologies like Hadoop and Spark add ever more value, the demand for talent skilled in Hadoop and Spark is increasing exponentially year on year. It is not just big players like Facebook, Google, Microsoft, and Amazon that are hiring big data talent; high-end tech startups like Crayon Data, Fractal Analytics, and Heckyl are also in search of professionals skilled in big data technologies like Hadoop. Average Hadoop developer salaries are expected to increase by 3.8% in 2017. Actual salaries will depend on the big data skill set a professional possesses and on their level of experience working on big data projects. Among all IT skills, having Hadoop skills on the resume adds up to 7% to the total salary.

(Source : http://www.businesszone.co.uk/community/blogs/ethanmillar/2017-salary-report-for-big-data-and-hadoop-skills )

Artificial intelligence on Hadoop: Does it make sense? ZDNet.com, May 26, 2017.

MapR unveiled Quick Start Solution (QSS), its novel solution focusing on deep learning applications. QSS is a deep learning product and service offering from the popular Hadoop vendor that will enable the training of compute-intensive deep learning algorithms. The question at hand is whether MapR is trying to bring AI to Hadoop with this offering. Deep learning is only a small part of machine learning, which is itself an integral part of AI. Thus, MapR’s QSS is DL on Hadoop and not exactly AI on Hadoop. The ability to run machine learning or deep learning algorithms on Hadoop does not make a Hadoop vendor an AI vendor.

(Source : http://www.zdnet.com/article/artificial-intelligence-on-hadoop/ )


About the Author

ProjectPro

ProjectPro is the only online platform designed to help professionals gain practical, hands-on experience in big data, data engineering, data science, and machine learning related technologies. Having over 270+ reusable project templates in data science and big data with step-by-step walkthroughs,
