Azure HDInsight, a product of Microsoft and Hortonworks Inc., is a managed Hadoop service that lets users deploy and manage Hadoop clusters on the Azure cloud. Microsoft has upgraded this cloud-based platform with new security enhancements and a performance boost that the company states will speed up Big Data queries 25x. The security enhancements center on improved authentication and identity management features. The performance boost comes from a new feature called Live Long and Process (LLAP), used in Apache Hive.
About 7,000 people attended the Strata + Hadoop World conference held in New York last week to showcase emerging trends in big data technologies. Among the speakers was White House chief data scientist DJ Patil, who laid out his vision of where machine learning, analytics, the Internet of Things, autonomous vehicles and smart cities will take us in the near future.
According to a recent Forrester report titled Big Data Management Solutions Forecast 2016 to 2021, Hadoop and NoSQL will see major growth, with the two markets growing 25.0% and 32.9% every year respectively. Forrester analysts predict that the big data technology space is likely to grow three times as fast as the overall technology market over the next five years.
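To put the annual growth rates quoted above in perspective, here is a minimal sketch of what they would compound to over the five-year forecast window. It assumes each rate stays constant and compounds once per year, which the report summary does not state explicitly; the function name `growth_multiple` is made up for this illustration.

```python
def growth_multiple(annual_rate: float, years: int) -> float:
    """Return how many times larger a market becomes after `years`
    of constant compound growth at `annual_rate` (0.25 means 25%)."""
    return (1.0 + annual_rate) ** years


# Rates taken from the Forrester figures cited in the article.
# Assumption: the rates hold steady for all five years.
hadoop_multiple = growth_multiple(0.25, 5)
nosql_multiple = growth_multiple(0.329, 5)

print(f"Hadoop market after 5 years: {hadoop_multiple:.2f}x its starting size")
print(f"NoSQL market after 5 years:  {nosql_multiple:.2f}x its starting size")
```

Under these assumptions the Hadoop market would end up roughly 3.05 times its starting size and the NoSQL market roughly 4.15 times.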
Speed and quality are the main concerns when extracting value from data. Quality delivers accuracy, and speed delivers relevance. A benchmark study conducted by RedPoint Global, a provider of data management and customer engagement technology, reported that its data management platform exceeded previous benchmarks in terms of data quality, speed, usability and maturity. According to the study, the platform completed the same workload 1,900 times faster than a Hadoop MapReduce-based Hive or Tez approach and 500 times faster than Spark.
Fields such as biology, astronomy and physics are making use of Big Data for better results. The social sciences, however, have been one of the isolated areas. The University of Essex teamed up with Sage Publishing to produce a Sage white paper titled “Who Is Doing Computational Social Science?: Trends in Big Data Research.” Around 9,412 social scientists participated in the survey, including around 3,302 from the US, 728 from the UK, 405 from India and 353 from Canada. The report revealed that 33% have already been involved in big data research, while 49% were definitely planning to do so in future.
The competing benchmarks released by Cloudera and Hortonworks for SQL-on-Hadoop engines show that SQL remains firmly established on Hadoop platforms. Programmers can model customer ecosystems as social graphs, perform sentiment analysis, do machine learning and run streaming workloads. However, the foremost question companies have when using SQL-on-Hadoop is how fast interactive SQL is. Considering the appeal Hadoop has among Python and R developers, using it only for SQL might not be a smart move. The benchmarks show how important SQL is becoming for companies adopting Hadoop.
(Source : http://www.zdnet.com/article/sql-on-hadoop-benchmarks-get-serious/)
The primary target for Hadoop commodity clusters is offloading work from a high-cost data warehouse. A new Hadoop tool from Cloudera Inc. called Navigator Optimizer allows programmers to look at ETL queries running on other platforms and see how they would behave in a Hadoop environment. The new tool can help Hadoop developers analyse how well queries perform in Impala or Hive.
A recent survey by DNV GL - Business Assurance and GFK Eurisko reports that 76% of organisations plan to invest more in big data and 52% of organisations consider Big Data an opportunity. In line with this survey, there is growing interest in big data within MuleSoft’s ecosystem, which the company plans to support further with Anypoint Connector for Hadoop (HDFS) v5.0.0.
The Hadoop ecosystem has improved over the years to become enterprise ready, and it is still growing on the maturity scale. Hadoop can build the strength of a true enterprise-ready platform by focusing on two major initiatives: data lineage and data quality.
OLAP is usually done on smaller datasets in traditional and legacy systems. Now, however, OLAP workloads are being moved to data lakes that run Apache Hadoop and Spark. This is possible thanks to newer Apache projects such as Apache Kylin, Apache Druid and Apache Lens.
(Source: https://dzone.com/articles/olap-for-big-data )
Looker is a database analytics engine that helps data analysts curate complex Hadoop data sets. Data analysts can build a data model for all the data present in Hadoop, transforming it into valuable metrics so that stakeholders and business users can explore the historical data stored in Hadoop.