Abstract

(1) Background: This work investigates whether and how researcher-physicians can be supported in their knowledge discovery process by employing Automated Machine Learning (AutoML). (2) Methods: We take a design science research approach and select the Tree-based Pipeline Optimization Tool (TPOT) as the AutoML method based on a benchmark test and requirements from researcher-physicians. We then integrate TPOT into two artefacts: a web application and a notebook. We evaluate these artefacts with researcher-physicians to examine which approach suits researcher-physicians best. Both artefacts have a similar workflow, but different user interfaces because of a conflict in requirements. (3) Results: Artefact A, a web application, was perceived as better for uploading a dataset and comparing results. Artefact B, a Jupyter notebook, was perceived as better regarding the workflow and being in control of model construction. (4) Conclusions: Thus, a hybrid artefact would be best for researcher-physicians. However, both artefacts missed model explainability and an explanation of variable importance for their created models. Hence, deployment of AutoML technologies in healthcare remains currently limited to the exploratory data analysis phase.

Highlights

  • After reviewing the literature on the state of data analysis in healthcare, it becomes evident that there is still significant progress to be made in the application of analytics in healthcare [1,2]

  • Automation of the knowledge discovery process can increase the adoption of analytics by enabling domain experts to contribute to the knowledge discovery in the field using state-of the art techniques in adaptive analytic systems, especially with the amount of data that is available in healthcare

  • We evaluated the capabilities of the Automated Machine Learning (AutoML) methods to the requirements of the domain experts

Read more

Summary

Introduction

After reviewing the literature on the state of data analysis in healthcare, it becomes evident that there is still significant progress to be made in the application of analytics in healthcare [1,2]. Automation of the knowledge discovery process can increase the adoption of analytics by enabling domain experts to contribute to the knowledge discovery in the field using state-of the art techniques in adaptive analytic systems, especially with the amount of data that is available in healthcare. The introduction of the Electronic Health Record (EHR) accelerated the digitalisation of this formerly unavailable dataset. Combining this available EHR data with analytics provides great potential to improve the healthcare industry by making the data available for automatic processing [3]. Bridging purely foundational and purely applied research processes, both applied data science [6] and self-service data science studies pursue a translational (e.g., application-oriented) research process, often including CRISP-DM as the knowledge discovery process of choice [7]

Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.