News on Hadoop - April 2018
Big Data and Cambridge Analytica: 5 Big Picture Truths. Datamation.com, April 2, 2018.
Cambridge Analytica is in the news headlines as people realize how the Trump campaign and its donors used Facebook ads to create a sociopolitical shakeup. Let's look at what big truths the #deletefacebook movement reveals and what implications it has -
i) This data leak is not the result of a breach; it is quite common and nothing unusual. In this case, a big data company collected the data and used it to persuade customers. Businesses can buy data about your job history, ethnicity, reading habits, the number of vehicles you own, and almost anything else for that matter.
ii) Marketers are always keen on scooping up data about consumer preferences, and consumers are just too eager to share it without worrying about their privacy, giving big data miners the upper hand for the foreseeable future.
iii) Big data has produced a cultural shift towards data-driven decision making. In the future, big data will not just produce headlines, but will produce greater results.
iv) Big data may have gotten bad press after the Cambridge Analytica scandal, but there is still great potential for uplift in leveraging big data.
v) Big data now rules our world, and it is no longer possible for businesses to compete without it.
(Source : https://www.datamation.com/big-data/big-data-and-cambridge-analytica-5-big-picture-truths.html)
$11.45 Bn Big Data in Healthcare Market, 2025. PRNewsWire.com, April 3, 2018.
The global big data in healthcare market amounted to $11.45 billion in 2016 and is anticipated to see double-digit growth throughout 2017-2025. Healthcare data accounted for over 700 exabytes in 2017 and is projected to grow to 2,314 exabytes by 2020. Organizations are leveraging various analytical tools and AI techniques on this increasing volume of healthcare data to glean data-driven insights that can help reduce healthcare costs, enhance revenue streams, develop personalized medicine, and manage proactive patient care. Analytics services is the fastest growing segment, with a dominant share of $5.80 billion in 2017.
(Source : https://www.prnewswire.com/news-releases/1145-bn-big-data-in-healthcare-market-2025-300623544.html )
How McDonald's Is Getting Ready For The 4th Industrial Revolution Using AI, Big Data And Robotics. Forbes.com, April 4, 2018.
Operating in 188 countries and serving more than 69 million customers, the fast-food burger joint McDonald's creates huge volumes of data. McDonald's is leveraging AI, Big Data and Robotics to keep costs low and efficiencies high in the following ways -
i) Enhanced Customer Experience - McDonald's gets customer intelligence about where and when a customer goes to a restaurant, how often they go, whether they go into the restaurant or use the drive-thru, and what food they purchase. This helps McDonald's suggest complementary products and send personalized offers to increase sales when customers use the mobile app.
ii) McDonald's has rolled out new digital menus which change based on real-time analysis of data. The digital menu options change based on the time of day and current weather.
iii) McDonald's leverages big data to embrace a data-driven culture and better understand performance at each individual restaurant.
(Source : https://www.forbes.com/sites/bernardmarr/2018/04/04/how-mcdonalds-is-getting-ready-for-the-4th-industrial-revolution-using-ai-big-data-and-robotics/#2fde7e243d33 )
Facebook is offering a $40,000 bounty if you find the next Cambridge Analytica. CNBC.com, April 10, 2018
Facebook will pay $40,000 and above to people who can catch large data leaks. The company has announced the launch of a bounty program that rewards people who discover cases of data abuse on its platform, with payouts ranging from $500 to more than $40,000 depending on how large the discovery is. This is the first data abuse program of its kind in the industry. As of now, Facebook has 10 people on the bug bounty team but plans to hire more people to investigate any substantiated claims. Facebook pays over $1 million on average every year in bug bounties.
(Source : https://www.cnbc.com/2018/04/10/facebook-will-pay-up-to-40000-if-you-find-a-big-data-leak.html )
Big data analysis accurately predicts patient survival from heart failure. Yale.edu, April 12, 2018
Heart failure is a major cause of death and disability in the US, costing healthcare systems more than $30 billion per annum. The research team at Yale's Section of Cardiovascular Medicine analyzed data from more than 40,000 patients using statistical machine learning techniques and predicted patient outcomes one year after diagnosis. They also applied clustering techniques to classify patients into 4 identifiable categories with diverse responses to the most common medicines. These big data methods outperformed the existing measures of heart failure and predicted risk better than earlier published models.
(Source : https://news.yale.edu/2018/04/12/big-data-analysis-accurately-predicts-patient-survival-heart-failure )
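The kind of clustering the Yale team applied can be illustrated with a toy 1-D k-means sketch. The feature values, seed, and cluster count below are invented for illustration; the study's actual variables and methods are not reproduced here.

```python
import random

def kmeans_1d(points, k, iters=20, seed=0):
    """Toy 1-D k-means: split numeric values into k groups by proximity."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)      # pick k initial centroids
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        # Assign each point to its nearest centroid.
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical single feature per patient (e.g. a lab value), illustrative only.
values = [20.0, 22.0, 25.0, 40.0, 42.0, 55.0, 58.0, 70.0, 72.0, 75.0]
centroids, clusters = kmeans_1d(values, k=4)
```

Real studies cluster on many features at once and validate the groupings clinically, but the core idea - iteratively assigning patients to the nearest group centre - is the same.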
BlueData Partners with Computacenter to Provide Big-Data-as-a-Service in Germany. GlobalBankingandFinance.com, April 13, 2018.
BlueData, provider of a leading Big-Data-as-a-Service (BDaaS) software platform, has announced its collaboration with Computacenter Germany. This agreement will allow the IT infrastructure services provider to offer BlueData EPIC software to its customers in the German market. BlueData has gained popularity for its innovations using Docker containers that streamline the provisioning of various big data workloads in an elastic, on-demand, and multi-tenant architecture. BlueData's EPIC platform will help Computacenter's customers enhance agility and reduce costs for their overall big data infrastructure by giving them the ability to spin up instant clusters for Hadoop, Spark and other open source big data frameworks whilst ensuring enterprise-class security and performance.
(Source : https://www.globalbankingandfinance.com/bluedata-partners-with-computacenter-to-provide-big-data-as-a-service-in-germany/ )
Zoomlion using Cloudera to boost big data platform. Telecomasia.net, April 13, 2018.
Zoomlion, the Chinese construction machinery and sanitation equipment manufacturer, adopted Cloudera's big data platform to serve its growing big data needs. Zoomlion will use Cloudera's big data platform to provide data management and analytic services to its customers in over 100 countries across 6 continents. Zoomlion collects and processes IoT data - real-time working condition data and location information - from 120,000 high-tech industrial and agricultural machines. The company uses Cloudera's big data platform to help customers reduce their operating costs, optimize their own operational management capabilities and enhance the efficiency of equipment management.
(Source : https://www.telecomasia.net/content/zoomlion-using-cloudera-boost-big-data-platform )
Netflix Used Big Data To Identify The Movies That Are Too Scary To Finish. Forbes.com, April 18, 2018.
With more than 100 million users across the globe producing an extraordinary amount of data to analyze, Netflix gleans valuable insights from the data about its viewers to drive success. Netflix has used big data to identify a list of films so scary that viewers will not finish them. The signal: a user watched at least 70% of the movie (a data point the streaming service can calculate) and then turned it off, suggesting the movie was too horrifying to finish. A viewer could of course also turn off a movie because they disliked it, but the data showed that viewers who dislike a movie tend to turn it off well before the 70% threshold.
(Source : https://www.forbes.com/sites/bernardmarr/2018/04/18/netflix-used-big-data-to-identify-the-movies-that-are-too-scary-to-finish/#471296673990)
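The threshold logic described above can be sketched in a few lines of Python. The titles, numbers, and data layout here are invented for illustration; Netflix's actual pipeline is not public.

```python
# Hypothetical viewing records: (title, fraction_watched, finished).
sessions = [
    ("Cabin Fever", 0.74, False),
    ("Cabin Fever", 0.81, False),
    ("Jaws", 0.30, False),       # abandoned well before 70%: likely disliked
    ("Jaws", 1.00, True),
    ("The Conjuring", 0.78, False),
]

THRESHOLD = 0.70  # the 70% cut-off described in the article

def too_scary_candidates(sessions):
    """Titles viewers engaged with past the threshold but did not finish."""
    return sorted({title for title, frac, finished in sessions
                   if frac >= THRESHOLD and not finished})

print(too_scary_candidates(sessions))  # ['Cabin Fever', 'The Conjuring']
```

The early-abandonment case ("Jaws" at 30%) is filtered out, which is exactly how the threshold separates "disliked" from "too scary to finish".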
DataWorks 18: Hortonworks styles itself 3.0 with a 'DataPlane' service. ComputerWeekly.com, April 18, 2018.
Hortonworks announced the launch of a data governance plug-in named Studio for its DataPlane platform at the recent Hadoop Summit. The DataPlane platform, announced last year, provides data architecture as a service with data governance baked in via Apache Atlas. Scott Gnau, chief technology officer at Hortonworks, said that the supplier is now at stage 3.0, where plain Hadoop was 1.0 and 2.0 was all about data flow. The recent upgrade to its DataFlow product, HDF 3.0, incorporated the streaming analytics manager tool.
(Source : https://www.computerweekly.com/news/252439269/DataWorks-18-Hortonworks-styles-itself-30-with-a-DataPlane-service )
Big Data as a Service Market Set to Grow at CAGR of 60.9% Between 2016-2020. ChicagoEveningPost.com, April 19, 2018.
The rise of IT consumerization has led technology providers to create new platforms for financial applications that help banks gather, consolidate, and consume information in ways that change how customers interact through mobile apps and websites. Financial services are at a positive stage of big data adoption. Many financial services firms today depend on enhancing their traditional data infrastructure to tackle issues such as workforce mobility, customer data management, risk, and multichannel effectiveness. These persistent problems are compelling financial institutions to deploy big data services, leading to drastic growth in the BDaaS market. The global big data as a service market is anticipated to grow at a compound annual growth rate of 60.9% over the period 2016-2020. The report Global Big-data as a Service Market 2016-2020, prepared by marketsresearchreports.biz, is based on an in-depth market analysis with inputs from industry experts.
(Source : http://chicagoeveningpost.com/2018/04/19/big-data-as-a-service-market-set-to-grow-at-cagr-of-60-9-between-2016-2020/ )
Audi puts open source big data foundations in place for car usage data. ComputerWorlduk.com, April 26, 2018.
Audi uses diverse open source big data technologies to collect large volumes of data from its new luxury car models and from machinery used at its production facilities. Audi is a big Hadoop user, with a Hadoop cluster of 1PB storage capacity, 288 cores spread across 12 nodes, and 6TB of RAM. It also has a Kafka cluster with 4 nodes, 128GB of RAM and 16TB of raw capacity. Adoption of these big data tools has led to two proofs of concept - one using data from screwdrivers at the production facilities, the other using car usage data transmitted from the control units. Using the car data, Audi found that every new Audi model transmits approximately 25,000 signals over the air into the HDFS store, which staff use for analysis. The analysis is then layered with data visualization tools like Tableau to give business users access to information that supports better design decisions on upcoming Audi models. Everything built at Audi today is on-premise or in its private cloud. Audi has now turned to a third-party vendor, Confluent, for an HDFS connector that brings data through Kafka pipelines into HDFS, along with a metadata catalogue built in the cloud for locating data.
(Source : https://www.computerworlduk.com/data/audi-puts-open-source-big-data-foundations-in-place-for-car-usage-data-3676227/ )
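A Kafka-to-HDFS pipeline of the kind Audi describes is typically wired up with a Kafka Connect sink. A minimal illustrative configuration for Confluent's HDFS sink connector might look like the following - the connector name, topic, HDFS URL, and flush size are hypothetical placeholders, not Audi's actual settings:

```json
{
  "name": "car-signals-hdfs-sink",
  "config": {
    "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
    "tasks.max": "4",
    "topics": "car-signals",
    "hdfs.url": "hdfs://namenode:8020",
    "flush.size": "10000",
    "format.class": "io.confluent.connect.hdfs.avro.AvroFormat"
  }
}
```

Posted to the Kafka Connect REST API, a configuration like this streams records from the named topic into HDFS files, committing a new file every `flush.size` records.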
Africa has the potential to be a leader in open source distributed computing, says Standard Bank Hadoop expert. Computing.co.uk, April 26, 2018.
Data security has always been a growing concern for South Africa's Standard Bank as its data volume continues to grow inexorably. As in other countries across the globe, finding a home for archived data in the cloud is not yet an option for Standard Bank. The need for a central store for this increasing volume of data led the bank to adopt Hadoop. The Hadoop-based storage and analytics system provides real-time responsiveness across business use cases from marketing to fraud detection. The bank wants to adopt as much open source software as possible to cut costs, contribute to the open source community, and garner support from the community in return. It is about 90% open source, but the major roadblock for the bank is a shortage of data science skills. Data science talent is practically non-existent in South Africa, so the bank needs to let people from computer science backgrounds come in and try their approach to doing data science. The only way to reduce this skills shortage is to train people internally in data science and other big data tools such as Hadoop and Spark.
(Source : https://www.computing.co.uk/ctg/news/3031136/africa-has-the-potential-to-be-a-leader-in-open-source-distributed-computing-says-standard-bank-hadoop-expert )