Latest Update made on May 1, 2016.
There is a lot of buzz around big data making the world a better place, and one of the best ways to understand this is to look at how big data is used in the healthcare industry. Big data in healthcare is used for reducing cost overhead, curing diseases, improving profits, predicting epidemics and enhancing the quality of human life by preventing deaths. Scientific research labs, hospitals and other medical institutions are leveraging big data analytics to reduce healthcare costs by changing the models of treatment delivery. This article walks through the most prominent applications of big data in the healthcare industry.
Big Data in Healthcare Industry
The New York based research and consulting firm Institute for Health Technology Transformation estimates that in 2011 the US healthcare industry generated 150 billion gigabytes (150 exabytes) of data. This data was mostly generated by regulatory requirements, record keeping, compliance and patient care. Since then, there has been an exponential increase in data, which has led to an expenditure of $1.2 trillion on healthcare data solutions in the healthcare industry. McKinsey projects that the use of Big Data in healthcare can reduce healthcare data management expenses by $300 billion to $500 billion.
Big Data in healthcare originates from large electronic health datasets that are very difficult to manage with conventional hardware and software. Legacy data management methods and tools also make it impossible to leverage all this data usefully. Big Data in healthcare is an overwhelming concept not just because of the volume of data but also because of the variety of data types and the pace at which healthcare data needs to be managed. The sum total of data related to the patient and their well-being constitutes the “Big Data” problem in the healthcare industry. Big Data analytics has also become an emerging and crucial problem in healthcare informatics. Healthcare informatics in turn contributes to the development of Big Data analytic technology by posing novel challenges in terms of data knowledge representation, database design, data querying and clinical decision support.
Although most of the data in the healthcare sector is stored in printed form, the recent trend is towards rapid digitization of this data. Big Data in the healthcare industry promises to support a diverse range of healthcare data management functions such as population health management, clinical decision support and disease surveillance. The healthcare industry is still in the early stages of getting its feet wet in the large-scale integration and analysis of big data.
With 80% of the healthcare data being unstructured, it is a challenge for the healthcare industry to make sense of all this data and leverage it effectively for Clinical operations, Medical research, and Treatment courses.
The volume of Big Data in healthcare is anticipated to grow over the coming years, and the healthcare industry is expected to grow alongside changing healthcare reimbursement models, which poses critical challenges to the healthcare environment. Even though profit is not the sole motivator, it is extremely important for big data healthcare companies to use best-in-class techniques and tools that can leverage Big Data in healthcare effectively. Otherwise, these companies might find themselves skating on thin ice when it comes to generating profitable revenue.
Need of Hadoop in Healthcare Data Solutions
Charles Boicey, an Information Solutions Architect at UCI, says that “Hadoop is the only technology that allows healthcare to store data in its native form. If Hadoop didn’t exist we would still have to make decisions about what can come into our data warehouse or the electronic medical record (and what cannot). Now we can bring everything into Hadoop, regardless of data format or speed of ingest. If I find a new data source, I can start storing it the day that I learn about it. We leave no data behind.”
By the end of 2016, the health records of millions of people are likely to grow into tens of billions of records. Thus, the computing technology and infrastructure must be able to deliver a cost-efficient implementation of:
- Parallel data processing that is unconstrained.
- Storage for billions and trillions of unstructured data sets.
- Fault tolerance along with high availability of the system.
Hadoop technology is successful in meeting the above challenges faced by the healthcare industry, as the MapReduce engine and HDFS have the capability to process thousands of terabytes of data. Hadoop makes use of cheap commodity hardware, making it a pocket-friendly investment for the healthcare industry.
Here are 5 healthcare data solutions of Big Data and Hadoop:
1. Hadoop technology in Cancer Treatments and Genomics
Deepak Singh, the principal product manager at Amazon Web Services, said, “We’ve definitely seen an uptake in adopting Hadoop in the life sciences community, mostly targeting next-generation sequencing, and simple read mapping because what developers discovered was that a number of bioinformatics problems transferred very well to Hadoop, especially at scale.”
Industry reports indicate that about 3 billion base pairs constitute human DNA, and such large amounts of data must be organized effectively if we are to fight cancer. The biggest reason why cancer has not been cured yet is that cancer mutates in different patterns and reacts in different ways based on the genetic makeup of an individual. Hence, oncology researchers have concluded that in order to cure cancer, patients will need to be given personalized treatment based on the type of cancer and the individual patient’s genetic makeup. Leveraging Hadoop technology will offer great support for parallelization and help in mapping the 3 billion DNA base pairs using MapReduce programs.
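To give a feel for how a MapReduce program parallelizes work over DNA data, here is a minimal single-machine sketch in Python. It is not Hadoop code and not the method used by any lab mentioned here: the function names, the tiny sample sequence and the chunk size are all assumptions chosen for illustration. Each chunk is counted independently (the map step) and the partial counts are then merged (the reduce step), which is the property that lets a real cluster spread the chunks across many machines.

```python
from collections import Counter
from functools import reduce


def map_chunk(chunk: str) -> Counter:
    """Map step: count each base (A, C, G, T) in one chunk of sequence."""
    return Counter(base for base in chunk.upper() if base in "ACGT")


def reduce_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge the partial counts from two chunks."""
    return left + right


def count_bases(sequence: str, chunk_size: int = 4) -> Counter:
    """Split the sequence into chunks, map each chunk, then reduce."""
    chunks = [sequence[i:i + chunk_size]
              for i in range(0, len(sequence), chunk_size)]
    partial_counts = map(map_chunk, chunks)
    return reduce(reduce_counts, partial_counts, Counter())


# Hypothetical 10-base sample; a real genome has about 3 billion base pairs.
totals = count_bases("ACGTTGCAAC")
print(dict(totals))
```

Because no chunk depends on any other, the map step can run anywhere; only the small partial counts need to travel to the reducer.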
Ketan Paranjape, the global director of health and life sciences at Intel, talks about his efforts to build on those investments as he discusses the current state and future directions in health care analytics. The goal of using Hadoop in Healthcare, Paranjape says, is to collect and analyze data that can do everything from assess public health trends in a region of millions of people to pinpoint treatment options for one cancer patient.
David Cameron, Prime Minister of the UK, announced government funding of £300m in August 2014 for a 4-year project that aims to map 100,000 human genomes by the end of 2017, in collaboration with the American biotechnology firm Illumina and Genomics England. The main goal of this project is to make use of big data in healthcare to develop personalized medication for cancer patients.
CASI, or the Complex Adaptive Systems Initiative at Arizona State University, is developing a genomic data lake with petabytes of genetic data on individuals and treatments, potentially helping to identify the cancer gene and providing the base for developing life-saving cancer treatments through big data analysis.
Let’s take a look at how big and complicated genomics data can get and how Hadoop solves this problem. Consider a cancer drug that has been declared 40% effective in fighting the deadly disease. That could mean a number of things. It may mean that for patients with a certain genetic profile or from a certain area, the drug is 100% effective. But it might also mean that patients who do not have the suitable genetic profile, or who do not come from health-conducive environments, are not responding to the drug at all. For them, the drug will show a 0% effectiveness rate.
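The arithmetic behind that 40% figure can be sketched in a few lines of Python. The function name and the subgroup splits below are hypothetical, chosen only to show that very different subgroup outcomes can average out to the same headline number.

```python
def overall_response_rate(subgroups):
    """Weighted average of response rates.

    subgroups: list of (share_of_patients, response_rate) pairs,
    where the shares must sum to 1.
    """
    total_share = sum(share for share, _ in subgroups)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("subgroup shares must sum to 1")
    return sum(share * rate for share, rate in subgroups)


# Hypothetical split: 40% of patients respond fully, 60% not at all.
all_or_nothing = overall_response_rate([(0.4, 1.0), (0.6, 0.0)])

# A different hypothetical split that gives the same headline figure.
partial = overall_response_rate([(0.5, 0.8), (0.5, 0.0)])

print(all_or_nothing, partial)
```

Both calls give an overall rate of about 0.4, which is why the headline number alone says little about any individual patient.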
Healthcare data is so complex because a single human genome has 20,000 different genes. Now suppose we store this data in a traditional database and combine each of these genes with 1 million DNA variations. That would mean 20 billion rows of data for each person. Legacy systems are just not equipped to deal with this volume of big data.
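The row count is simple multiplication, sketched below in Python. The constant names and the cohort size of 1,000 people are assumptions added for illustration; only the 20,000 and 1 million figures come from the text.

```python
GENES_PER_GENOME = 20_000          # figure quoted in the article
VARIATIONS_PER_GENE = 1_000_000    # the "1 mn" figure quoted in the article

# One row per (gene, variation) combination for a single person.
rows_per_person = GENES_PER_GENOME * VARIATIONS_PER_GENE


def rows_for_cohort(people: int) -> int:
    """Total rows needed to store the same table for a group of people."""
    return people * rows_per_person


print(f"{rows_per_person:,} rows per person")
print(f"{rows_for_cohort(1_000):,} rows for a hypothetical 1,000-person study")
```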
2. Hadoop technology in Monitoring Patient Vitals
There are several hospitals across the world that use Hadoop to help the hospital staff work efficiently with Big Data. Without Hadoop, most patient care systems could not even imagine working with unstructured data for analysis.
Children’s Healthcare of Atlanta treats over 6,200 children in its ICUs. The duration of stay in the pediatric ICU typically varies from a month to a year. Children’s Healthcare of Atlanta uses bedside sensors to continuously track patient vital signs such as blood pressure, heartbeat and respiratory rate. These sensors produce large chunks of data, which legacy systems cannot store for more than 3 days for analysis. The main motive of Children’s Healthcare of Atlanta was to store and analyze the vital signs. If there was any change in pattern, the hospital wanted an alert to be sent to a team of doctors and assistants. All this was successfully achieved using Hadoop ecosystem components: Hive, Flume, Sqoop, Spark and Impala.
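The idea of alerting on a change in pattern can be illustrated with a small Python sketch. This is not the hospital's pipeline: the rolling-mean rule, the window size, the deviation threshold and the sample heart-rate readings are all assumptions made for illustration. A production system would run comparable logic over a live sensor stream.

```python
from collections import deque
from statistics import mean


def detect_alerts(readings, window=5, max_deviation=20.0):
    """Return (index, value) for readings that differ from the mean of the
    previous `window` readings by more than `max_deviation`."""
    recent = deque(maxlen=window)  # holds only the latest `window` readings
    alerts = []
    for index, value in enumerate(readings):
        # Only start checking once a full window of history is available.
        if len(recent) == window and abs(value - mean(recent)) > max_deviation:
            alerts.append((index, value))
        recent.append(value)
    return alerts


# Hypothetical heart-rate readings (beats per minute) with one spike.
heart_rate = [80, 82, 81, 79, 80, 83, 130, 82, 81]
alerts = detect_alerts(heart_rate)
print(alerts)
```

With these sample readings, only the spike at index 6 is flagged; the readings after it stay within the threshold even though the spike has raised the rolling mean.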
3. Hadoop technology in the Hospital Network
A Cleveland Clinic spinoff company known as Explorys is making use of Big Data in healthcare to provide the best clinical support, reduce the cost of care measurement and manage the population of at-risk patients. Explorys has reportedly built the largest database in the healthcare industry with over a hundred billion data points all thanks to Hadoop.
Explorys uses Hadoop technology to help its medical experts analyze, in real time, the flood of data arriving from diverse sources such as financial data, payroll data and electronic health records.
The analytics tool developed by Explorys is used for data mining so that it helps clinicians determine the deviations among patients and the effects treatments have on their health. These insights help the medical practitioners and health care providers find out the best treatment plans for a set of patient populations or for an individual patient.
4. Hadoop technology in Healthcare Intelligence
The healthcare insurance business operates by collating the associated costs (the risk) and dividing them equally among the members of the risk group. In such circumstances, the data and the outcomes are always dynamic and changing. Using Hadoop technology in Healthcare Intelligence applications helps hospitals, payers and healthcare agencies increase their competitive advantage by devising smart business solutions.
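The pooling rule described here, total risk divided equally across the group, can be written as a short Python sketch. The function name, cost figures and group size are hypothetical, and real premium calculations include many factors this ignores.

```python
def premium_per_member(expected_costs, members):
    """Split the pooled expected cost equally across the risk group."""
    if members <= 0:
        raise ValueError("risk group must have at least one member")
    return sum(expected_costs) / members


# Hypothetical expected annual costs for three categories of claims.
costs = [120_000.0, 60_000.0, 20_000.0]
premium = premium_per_member(costs, members=400)
print(premium)
```

Because the inputs keep changing as new claims arrive, the calculation has to be rerun continually over fresh data, which is where large-scale processing comes in.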
For instance, let’s assume that a healthcare insurance company is interested in finding, for a particular region, the age below which individuals are not victims of certain diseases. This data will help the insurer compute the cost of an insurance policy. To find that age, insurance companies have to process huge data sets to extract meaningful information such as medicines, diseases, symptoms, opinions and geographic region details. In this scenario, using Hadoop’s Pig, Hive and MapReduce is the best solution for processing such large datasets.
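One simple way to read the age question is as a group-by: for each region and disease, find the youngest age at which a case was recorded. The Python sketch below shows that interpretation on a handful of made-up records. The field names and data are assumptions, and on real data the same aggregation would be expressed as a Hive or Pig query or a MapReduce job instead of a loop in memory.

```python
def youngest_case_age(claims):
    """Map each (region, disease) pair to the youngest recorded case age."""
    youngest = {}
    for claim in claims:
        key = (claim["region"], claim["disease"])
        age = claim["age"]
        # Keep the smallest age seen so far for this region and disease.
        if key not in youngest or age < youngest[key]:
            youngest[key] = age
    return youngest


# Hypothetical claim records.
claims = [
    {"region": "North", "disease": "diabetes", "age": 52},
    {"region": "North", "disease": "diabetes", "age": 38},
    {"region": "North", "disease": "asthma", "age": 9},
    {"region": "South", "disease": "diabetes", "age": 45},
]
thresholds = youngest_case_age(claims)
print(thresholds)
```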
Sunil Kakre, Director of IT at DignityHealth, spoke at a recent Hadoop Summit about their journey of moving healthcare analytics to Hadoop. DignityHealth is one of the leading healthcare providers in the US. They started moving to Hadoop a year ago. As Hadoop constantly evolves and matures, it is helping eliminate the challenges the healthcare industry faces with legacy systems. Data in the healthcare industry is varied and unpredictable. There is tremendous pressure on the business, as policies, regulations and many other things keep changing. There is a need for a robust tool with the analytical capability to analyse this ever-changing, morphing data. This is where Hadoop applications come in.
Let’s take an example. Over a million people are affected by sepsis in the US. Nearly 28 - 50% of the people affected by this condition die. This number is higher than the total number of people dying from prostate cancer, breast cancer and AIDS combined. This example is chosen because the condition is time sensitive: the sooner you analyse and react, the more lives you can save. Imagine if you could analyse how many hospitalizations happen for this condition, how many deaths result from it, and what the time lag is between the condition and death or cure. Imagine if you could build analytics around sepsis and build an exploratory or intelligence tool that predicts the number of people affected by sepsis who can still be cured. You could save lives.
DignityHealth processes more than 30 terabytes of data from its 40+ hospitals and multiple healthcare systems. But the data is stored in silos. The need is to bring this data into one place so that it can be analysed all together to tackle a common disease. This is a great opportunity for Hadoop applications to really make a difference.
5. Hadoop technology in Fraud Prevention and Detection
At least 10% of healthcare insurance payments are attributed to fraudulent claims. Worldwide, this is estimated to be a multi-billion dollar problem. Fraudulent claims are not a novel problem, but the complexity of insurance fraud seems to be increasing exponentially, making it difficult for healthcare insurance companies to deal with.
Big Data Analytics helps healthcare insurance companies find different ways to identify and prevent fraud at an early stage. Using Hadoop technology, insurance companies have been successful in developing predictive models to identify fraudsters by making use of real-time and historical data of medical claims, weather data, wages, voice recordings, demographics, cost of attorneys and call center notes. Hadoop’s capability to store large unstructured data sets in NoSQL databases and using MapReduce to analyze this data helps in the analysis and detection of patterns in the field of Fraud Detection.
The upswing for big data in the healthcare industry is due to the falling cost of storage. As recently as 5 years ago, the cost of a scalable relational database with a permanent software license was $100,000 per TB, along with an additional cost of $20,000 per year for support and maintenance. Now, with the advent of Hadoop in Big Data analytics, it is possible to store, manage and analyze the same amount of data with a yearly subscription of just $1,200. The increasing demand for Hadoop technology in healthcare will eliminate the “one size fits all” approach to medicines and treatments in the healthcare industry. The coming years will see the healthcare industry provide personalized patient medications at controlled costs.
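The cost gap quoted above can be made concrete with a small Python sketch. It rests on assumptions the article does not state: that the $20,000 support fee and the $1,200 subscription both apply per terabyte per year, and that prices stay flat over time. The function names and the 5-year horizon are also illustrative.

```python
def relational_cost(terabytes, years,
                    license_per_tb=100_000, support_per_tb_year=20_000):
    """One-time license plus yearly support (assumed to be per TB)."""
    return terabytes * (license_per_tb + support_per_tb_year * years)


def hadoop_cost(terabytes, years, subscription_per_tb_year=1_200):
    """Yearly subscription only (assumed to be per TB)."""
    return terabytes * subscription_per_tb_year * years


# Hypothetical comparison: 1 TB kept for 5 years.
legacy = relational_cost(terabytes=1, years=5)
hadoop = hadoop_cost(terabytes=1, years=5)
print(legacy, hadoop)
```

Under these assumptions the legacy option costs $200,000 against $6,000 for the subscription over the same period.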
Did you like our top 5 healthcare data solutions of Big Data? If you work in the healthcare industry or have an idea of any other healthcare data solutions that help big data healthcare companies harness the power of Hadoop, please leave a comment below!