Abstract

Big Data technologies have great potential to transform healthcare, as they have revolutionized other industries. In addition to reducing costs, they could save millions of lives and improve patient outcomes. Heart failure (HF) is the leading cause of death, both nationally and internationally. The social and individual burden of this disease can be reduced by early detection. However, the signs and symptoms of HF in its early stages are not clear, so it is relatively difficult to prevent or predict. The main objective of this research is to propose a model that predicts patients with HF using a multi-structure dataset integrated from various sources. Our proposed model builds on a study of the current analytical techniques that support heart failure prediction, followed by the construction of an integrated model based on Big Data technologies using the WEKA analytics tool. To achieve this, we extracted important heart failure factors from the King Saud University Medical City (KSUMC) system, Saudi Arabia, which are available in structured, semi-structured, and unstructured formats. Unfortunately, much of the information is buried in unstructured data. We applied pre-processing techniques to enhance the parameters and integrated the different data sources in the Hadoop Distributed File System (HDFS) using the distributed-WEKA-spark package. We then applied data-mining algorithms to discover patterns in the dataset in order to predict heart risks and causes. Finally, the analysis report is stored and distributed so that the insights from the prediction can be used. Our proposed model achieved an accuracy of 93.75% and an Area Under the Curve (AUC) of 94.3%.

Highlights

  • In recent years, a new trend called "Big Data" has emerged in the information technology field

  • We collaborated with King Saud University Medical City (KSUMC) in Riyadh, Saudi Arabia, to manually extract the clinical and demographic data, covering January 2015 to December 2015, that we needed to evaluate the performance of the proposed model in identifying Heart Failure (HF) risk

  • Logistic regression performed well in the integrated models, with over 90% recall, in contrast to its poor performance in the single-dataset models; this may result from the nature of the algorithm, which tends to predict better for problems with many attributes
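
The abstract and highlights report accuracy, recall, and AUC. As a reminder of what these three measures capture, here is a minimal sketch in plain Python. It is not the authors' WEKA pipeline: the function names are illustrative, and the labels and scores are hypothetical toy values, not KSUMC data. AUC is computed with the pairwise (Mann-Whitney) formulation, which is an assumption about presentation rather than the method the paper used.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Fraction of truly positive cases (e.g. HF patients) that were detected."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float], positive: int = 1) -> float:
    """Probability that a random positive case is scored above a random
    negative case (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = HF, 0 = no HF; scores are predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold of 0.5 is an assumption

print(accuracy(y_true, y_pred))  # 0.75
print(recall(y_true, y_pred))    # 0.75
print(auc(y_true, scores))       # 0.9375
```

Recall matters in a screening setting like this one because a missed HF patient (a false negative) is usually costlier than a false alarm, while AUC summarizes ranking quality independently of the chosen threshold.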


Summary

Introduction

A new trend called "Big Data" has emerged in the information technology field. Using Big Data analytics, organizations can extract information from massive, complex, interconnected, and varied datasets (both structured and unstructured), leading to valuable insights. Analytics can be performed on big data using a new class of technologies that includes Hadoop [2], R [3], and Weka [4]. These technologies form the core of an open source software framework that supports the processing of huge datasets. One report points to the healthcare sector as a potential field where valuable insights are buried in structured, unstructured, or highly varied data sources that can be leveraged through Big Data analytics. The same report predicts that if U.S. healthcare could use big data effectively, the hidden value of data in the sector could exceed $300 billion every year. According to "Big Data Cure", published last March by MeriTalk [6], 59% of federal executives working in healthcare agencies indicated that their core mission would depend on Big Data within five years.

