Few tech buzzwords have been more prominent than “Big Data”, and SAP is one of them. Companies can analyse structured big data in real time with in-memory technology, and SAP can be used for detailed simulations, real-time reports and queries on mobile end devices. BI solutions with in-memory technology store data in working memory instead of on the hard drive, which makes the data faster to process, evaluate and use. SAP alone is not enough for this, however. With its announcement at a conference in San Francisco that it will embrace Hadoop, SAP is making sure the big data market knows it is hip to the trend. What follows is a detailed explanation of how SAP and Hadoop together can bring novel big data solutions to the enterprise.
“Adoption is the only option. Hadoop is Changing Our World and Changing Yours”-Mike Gualtieri, Principal Analyst at Forrester.
“SAP systems hold vast amounts of valuable business data -- and there is a need to enrich this, bring context to it, using the kinds of data that is being stored in Hadoop. SAP has long seen Hadoop and NoSQL as integral to a complete and unified Big Data solution and it has worked -- and continues to work -- to treat Hadoop as a valuable part of an extension to its solutions.”-Irfan Khan, CTO for SAP Global Customer Operations
The SAP community has expressed interest in the growing number of Hadoop deployments in the enterprise. The maximum value of big data can be extracted by integrating the in-memory processing capabilities of SAP HANA (High-Performance Analytic Appliance) with Hadoop's ability to store large unstructured datasets. An organization can then draw on every source of data to discover hidden insights that pave the way for new revenue-generating opportunities.
“With Big Data, you’re getting into streaming data and Hadoop. They’re pushing Hana to begin to embrace more of those scenarios.”- Henry Morris, senior VP with IDC
SAP is considering Apache Hadoop as a large-scale data storage container for Internet of Things (IoT) deployments and for any other application deployment where data collection and processing requirements are geographically distributed. SAP intends to develop a deeper integration with Apache Hadoop by using Apache Spark as the data filtering mechanism. Apache Spark can serve as an in-memory analysis and data streaming platform (an intelligent processing engine) that speeds up access to data in Hadoop.
For example, consider IoT deployments in the oil and gas utility sector, where many types of sensors provide readings at high volume. In such circumstances, Apache Hadoop provides low-cost storage for the huge volumes of sensor data. Data generated by sensors does not follow any specific format, so Apache Spark is used to filter, alter and blend it so that only the crucial sensor signals are sent to the IoT applications running on the SAP HANA cloud platform.
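The filter, alter and blend step described above can be sketched in a few lines. This is a minimal illustration in plain Python, not Spark or SAP code: the record layout, the `THRESHOLDS` values and the function names are all assumptions made up for the example. It normalizes loosely formatted readings, averages them per sensor, and keeps only the signals that cross a threshold.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class Reading:
    sensor_id: str
    kind: str      # e.g. "pressure" or "temperature"
    value: float


# Hypothetical alert thresholds per reading kind (illustrative values only).
THRESHOLDS = {"pressure": 250.0, "temperature": 90.0}


def parse(raw: dict) -> Optional[Reading]:
    """Alter: normalize one loosely formatted record, or return None if unusable."""
    try:
        return Reading(str(raw["id"]), str(raw["kind"]).lower(), float(raw["value"]))
    except (KeyError, TypeError, ValueError):
        return None


def crucial_signals(raw_records):
    """Filter and blend raw records into the signals worth forwarding."""
    # Filter: drop records that cannot be parsed.
    readings = [r for r in map(parse, raw_records) if r is not None]

    # Blend: average all readings that share a sensor and a kind.
    grouped = {}
    for r in readings:
        grouped.setdefault((r.sensor_id, r.kind), []).append(r.value)
    blended = {key: mean(values) for key, values in grouped.items()}

    # Keep only averages above the threshold for their kind.
    return {
        key: avg
        for key, avg in blended.items()
        if avg > THRESHOLDS.get(key[1], float("inf"))
    }


# Sample input with inconsistent casing, string values and broken records.
raw = [
    {"id": "pump-1", "kind": "Pressure", "value": "260.5"},
    {"id": "pump-1", "kind": "pressure", "value": 255.5},
    {"id": "pump-2", "kind": "pressure", "value": 120},
    {"id": "pump-3", "kind": "temperature"},                  # missing value
    {"id": "pump-4", "kind": "temperature", "value": "n/a"},  # not a number
]
signals = crucial_signals(raw)
print(signals)  # {('pump-1', 'pressure'): 258.0}
```

In a real deployment the same three stages would run as distributed transformations over data stored in Hadoop, and only the small `signals` result would be forwarded to the application layer.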
“SAP's latest release of HANA offers Big Data capabilities”- June 16, 2015, Firstpost
SAP recently announced SPS10, the latest version of SAP HANA, which is geared towards big data capabilities. This version of HANA focusses on mission-critical applications, processing big data and connecting the Internet of Things (IoT) to help enterprises build next-generation big data applications. Enterprises can continue to harness big data by using the new data integration capabilities of SAP HANA together with the latest enterprise Hadoop distributions from Hortonworks or Cloudera. HANA SPS10 has a user interface that uses Apache Ambari to combine SAP HANA and Hadoop cluster administration, and it uses Apache Spark SQL for fast data transfer.
SAP has also partnered with Cloudera to provide solutions that work together with SAP HANA and Apache Hadoop. SAP's main motive for embracing Hadoop is easy connectivity to data, regardless of whether it comes from SAP software or from any other vendor.
Large organizations that run SAP Analytics through the SAP BI Platform, SAP Predictive Analytics and SAP Lumira can connect directly to an enterprise data hub based on Apache Hadoop to store huge volumes of data reliably and cost-effectively. The connection goes through Cloudera Impala, a leading interactive analytic database for Apache Hadoop.
How do SAP and Hadoop work together?
Image Credit : timoelliot.com
Enterprises that want to capture data from various sources at minimal cost and use it for analytics alongside real-time information from ERP systems should combine SAP and Apache Hadoop for the best outcomes. In SAP HANA, business information is physically stored in memory. Hadoop supports huge volumes of unstructured data such as sensor data, Facebook updates and Twitter feeds. By combining the two, big data applications can use “Smart Data Access”, which lets SAP HANA access data held in Hadoop virtually.
- SAP Data Services can work with Apache Hadoop (through Pig and Hive) and with SAP HANA to gain insights from data.
- The Apache Hadoop ecosystem can be used in diverse ways from the SAP HANA platform, combining SAP HANA's in-memory capabilities with Hadoop's distributed processing and mass data storage.
SAP HANA vs HADOOP
SAP and Hadoop make a happy couple. The challenging part of using Hadoop is extracting information from large datasets in real time, and SAP HANA has the in-memory capabilities to process large datasets in real time, which makes the two a good match for each other.
SAP and Hadoop need not run on the same system to deliver customer value. Data collected from various sources is uploaded into Apache Hadoop, which acts as a data warehouse for huge amounts of unstructured data. Datasets can then be extracted from Apache Hadoop into SAP HANA, which has built-in Hadoop connectivity. The SAP HANA and Hadoop combination lets decision makers in an enterprise run any kind of analytics report in SAP HANA.
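The flow described above (land raw data in cheap storage, extract the usable part, then report on it from an in-memory database) can be illustrated with a small stand-in. The sketch below is an assumption-laden toy, not SAP or Hadoop code: a multi-line string plays the role of files landed in Hadoop, Python's built-in `sqlite3` in `:memory:` mode plays the role of the in-memory database, and the field names `region` and `events` are invented for the example.

```python
import json
import sqlite3

# Stand-in for files landed in Hadoop: newline-delimited JSON, no schema enforced.
RAW_LANDING_ZONE = """\
{"source": "twitter", "region": "EMEA", "events": 120}
{"source": "sensor", "region": "APAC", "events": 15}
{"source": "twitter", "region": "EMEA", "events": 80}
not valid json
{"source": "facebook", "region": "APAC", "events": 40}
"""


def extract(raw_text):
    """Yield (region, events) pairs from the raw lines, skipping unusable ones."""
    for line in raw_text.splitlines():
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # raw storage may contain malformed lines
        if isinstance(record, dict) and {"region", "events"} <= record.keys():
            yield record["region"], int(record["events"])


def load_and_report(raw_text):
    """Load the extracted rows into an in-memory table and run a SQL report."""
    conn = sqlite3.connect(":memory:")  # stand-in for the in-memory database
    conn.execute("CREATE TABLE activity (region TEXT, events INTEGER)")
    conn.executemany("INSERT INTO activity VALUES (?, ?)", extract(raw_text))
    rows = conn.execute(
        "SELECT region, SUM(events) FROM activity GROUP BY region ORDER BY region"
    ).fetchall()
    conn.close()
    return rows


report = load_and_report(RAW_LANDING_ZONE)
print(report)  # [('APAC', 55), ('EMEA', 200)]
```

The point of the split is that the raw store tolerates messy input, while the in-memory side only ever sees clean, structured rows that are cheap to query.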
The combined potential of Apache Hadoop's parallel processing of large datasets and HANA's in-memory computing capabilities offers:
- Cost-effective large-scale storage and processing of structured, semi-structured and unstructured data such as text, video, audio, web logs and machine data.
- Data mining of raw data that has a dynamic schema (a schema that changes over time).
- Processing of heavily recursive machine learning algorithms and execution of queries that cannot be expressed easily in SQL.
- Easier and faster processing of complex information.
- A lower overall cost of big data analytics.
- Lower overall software licensing costs alongside accelerated analysis. Because Apache Hadoop sits between SAP HANA and the primary data sources, less data has to be held in HANA's memory.
Several big data vendors have announced their intent to support big data and IoT applications. SAP's official announcement that it will embrace Hadoop and support IoT deployments is nevertheless an intelligent business decision, because the SAP community has thought seriously about the bottlenecks likely to be encountered in future big data and IoT deployments.
SAP's official commitment to enterprise adoption of Hadoop should bring advanced, high-scale, real-time big data applications. “SAP embraces Hadoop” points towards ground-breaking and interesting big data deployments in the years to come.