Abstract

In recent years, data mining techniques such as classification, clustering, association analysis and regression have been widely used in healthcare to help analyze and predict disease and to improve the quality and efficiency of medical services. This paper presents a web-based platform for big data analysis in healthcare using data mining techniques. The platform consists of three main layers: the Apache Spark Layer, the Workflow Layer and the Web Service Layer. The Apache Spark Layer exposes basic Apache Spark functionality as regular Resilient Distributed Dataset (RDD) operations. It also provides a cache mechanism that reuses previously computed results wherever possible. The Workflow Layer encapsulates a variety of data mining nodes, each with a distinct role such as data source, algorithm model or evaluation tool. These nodes can be organized into a workflow, which is a directed acyclic graph (DAG), and the workflow is then submitted to the Apache Spark Layer for execution. We have implemented a number of models for healthcare big data, including Naive Bayes, Decision Tree and Logistic Regression. The Web Service Layer implements a rich RESTful API covering data uploading, workflow composition and analysis task submission. We also provide a graphical web interface through which users can carry out data mining efficiently without any programming, which can help medical staff with no programming background diagnose patients' conditions more accurately and efficiently.
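
The workflow idea described above (role-specific nodes wired into a DAG, executed in dependency order, with earlier results reused) can be illustrated with a minimal sketch. This is not the platform's implementation, which the abstract does not detail: the `Workflow` class, its `add_node` and `run` methods, and the toy data are all hypothetical, and plain Python functions stand in for Spark RDD operations.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Workflow:
    """Hypothetical toy workflow: named nodes wired into a DAG."""

    def __init__(self):
        self.funcs = {}     # node name -> callable taking upstream results
        self.deps = {}      # node name -> tuple of upstream node names
        self.cache = {}     # node name -> result computed in an earlier run
        self.executed = []  # names of nodes actually computed, in order

    def add_node(self, name, func, deps=()):
        self.funcs[name] = func
        self.deps[name] = tuple(deps)

    def run(self, target):
        # static_order() yields each node after all of its dependencies and
        # raises graphlib.CycleError if the graph contains a cycle.
        for name in TopologicalSorter(self.deps).static_order():
            if name in self.cache:
                continue  # reuse a result that was calculated before
            inputs = [self.cache[d] for d in self.deps[name]]
            self.cache[name] = self.funcs[name](*inputs)
            self.executed.append(name)
        return self.cache[target]


# Assumed toy data: (measurement, label) pairs standing in for patient records.
wf = Workflow()
wf.add_node("source", lambda: [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)])
# "Model" node: learns a threshold as the mean of the measurements.
wf.add_node("model", lambda rows: sum(x for x, _ in rows) / len(rows),
            deps=["source"])
# "Evaluation" node: fraction of rows the threshold classifies correctly.
wf.add_node(
    "evaluate",
    lambda rows, threshold: sum((x > threshold) == bool(y) for x, y in rows)
    / len(rows),
    deps=["source", "model"],
)

accuracy = wf.run("evaluate")
```

Calling `wf.run("evaluate")` a second time computes nothing new, because every node's result is already in the cache. For simplicity the sketch executes every node in the graph rather than only the ancestors of the target, and it never invalidates cached results; a real system would need to handle both.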
