Abstract

Public healthcare has a history of cautious adoption of artificial intelligence (AI) systems. The rapid growth of data collection and linking capabilities, combined with the increasing diversity of data-driven AI techniques, including machine learning (ML), has brought both ubiquitous opportunities for data analytics projects and increased demands for the regulation and accountability of the outcomes of these projects. As a result, the area of interpretability and explainability of ML is gaining significant research momentum. While there has been some progress in the development of ML methods themselves, the methodological side has seen limited progress. This limits the practicality of using ML in the health domain: the difficulty of explaining the outcomes of ML algorithms to medical practitioners and policy makers in public health has been a recognized obstacle to the broader adoption of data science approaches in this domain. This study builds on earlier work that introduced CRISP-ML, a methodology that determines the interpretability level required by stakeholders for a successful real-world solution and then helps in achieving it. CRISP-ML was built on the strengths of CRISP-DM, addressing the gaps in handling interpretability. Its application in the Public Healthcare sector follows its successful deployment in a number of recent real-world projects across several industries and fields, including credit risk, insurance, utilities, and sport. This study elaborates on how the CRISP-ML methodology determines, measures, and achieves the necessary level of interpretability of ML solutions in the Public Healthcare sector. It demonstrates how CRISP-ML addressed the problems of data diversity, the unstructured nature of data, and the relatively low linkage between diverse data sets in the healthcare domain. The characteristics of the case study used here are typical of healthcare data, and CRISP-ML delivered on these issues, ensuring the required level of interpretability of the ML solutions developed in the project. The approach ensured that interpretability requirements were met, taking into account public healthcare specifics, regulatory requirements, project stakeholders, project objectives, and data characteristics. The study concludes with three main directions for the development of the presented cross-industry standard process.

Highlights

  • Contemporary data collection and linking capabilities, combined with the growing diversity of data-driven artificial intelligence (AI) techniques, including machine learning (ML) techniques, and the broader deployment of these techniques in data science and analytics, have had a profound impact on decision-making across many areas of human endeavor

  • Among these properties of ML solutions, interpretability is particularly important for human-centric areas like healthcare, where it is crucial for end users to have access to an accurate model, to trust the validity and accuracy of the model, and to understand how the model works, what recommendation it has made, and why

  • We focus on a single case study from a health-related domain in order to present comprehensive coverage of each stage and the connections between the stages, and to provide examples of how the required level of interpretability of the solution is achieved through carefully crafted involvement of the stakeholders as well as decisions made at each stage

Summary

INTRODUCTION

Contemporary data collection and linking capabilities, combined with the growing diversity of data-driven artificial intelligence (AI) techniques, including machine learning (ML) techniques, and the broader deployment of these techniques in data science and analytics, have had a profound impact on decision-making across many areas of human endeavor. At the same time, working with data in the healthcare domain is complex at every step: establishing and finding the relevant, typically numerous, diverse, and heterogeneous data sources required to address the research objective; integrating and mapping these data sources; identifying and resolving data quality issues; pre-processing and feature engineering without losing or distorting information; and using the resulting high-dimensional, complex, sometimes unstructured data to build a high-performing interpretable model. This complexity further supports the argument for the development of ML methodologies that explicitly embed interpretability throughout the data science project life cycle and ensure the achievement of the level of interpretability of ML solutions that has been agreed upon for the project. We use these arguments as dimensions around which we elaborate the challenges and opportunities for the design of a cross-industry data science methodology that is capable of handling the interpretability of ML solutions under the complexity of the healthcare domain.
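To make this workflow concrete, the following minimal sketch walks through the same steps on synthetic data: linking two heterogeneous sources, resolving quality issues, engineering features that remain meaningful to domain experts, and fitting a model whose rules can be read directly. It uses Python with pandas and scikit-learn; every data set, field name, and threshold below is a hypothetical illustration and is not drawn from the CRISP-ML case study.

```python
# Hypothetical sketch of an interpretability-aware healthcare data pipeline.
# All data, field names, and thresholds are illustrative assumptions; they
# are not taken from the CRISP-ML case study discussed in this article.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
n = 1_000

# 1. Establish diverse, heterogeneous sources (here: two synthetic tables).
admissions = pd.DataFrame({
    "patient_id": np.arange(n),
    "age": rng.integers(18, 95, n),
    "length_of_stay": rng.poisson(4, n),
})
labs = pd.DataFrame({
    "patient_id": rng.permutation(n),
    "hba1c": rng.normal(6.0, 1.2, n).round(1),
})

# 2. Integrate and map the sources via record linkage on a shared key.
data = admissions.merge(labs, on="patient_id", how="inner")

# 3. Identify and resolve data quality issues (drop implausible lab values).
data = data[(data["hba1c"] > 3) & (data["hba1c"] < 15)].copy()

# 4. Feature engineering that preserves meaning for domain experts.
data["elderly"] = (data["age"] >= 65).astype(int)
y = (0.3 * data["elderly"] + 0.1 * data["hba1c"]
     + rng.normal(0, 0.3, len(data)) > 1.0).astype(int)  # synthetic target
X = data[["age", "length_of_stay", "hba1c", "elderly"]]

# 5. Build a model whose reasoning stakeholders can inspect directly:
#    a shallow decision tree, trading some accuracy for transparency.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

print(f"held-out accuracy: {model.score(X_test, y_test):.2f}")
print(export_text(model, feature_names=list(X.columns)))  # human-readable rules
```

In an actual CRISP-ML project, the interpretability level agreed with stakeholders would drive the choice of model family at step 5 and the form in which the model's reasoning is reported back at each stage.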

High Proportion of Data Science Project Failures
Consistent Measurement and Evaluation of Interpretability of ML Solutions
The Emerging Need for Standard Methodology for Handling Interpretability
CRISP-ML METHODOLOGY—TOWARD INTERPRETABILITY-CENTRIC CREATION OF ML SOLUTIONS
Building the Project Interpretability Matrix
Interpretability-Related Aspects of the Project Charter
Entries to the Project Interpretability Matrix at Each Stage of CRISP-ML
Creating the Project IM
CONCLUSIONS
Findings
DATA AVAILABILITY STATEMENT