Abstract

Background: Machine learning (ML) can be an effective tool to extract information from attribute-rich molecular datasets for the generation of molecular diagnostic tests. However, the way in which the resulting scores or classifications are produced from the input data may not be transparent. Algorithmic explainability or interpretability has become a focus of ML research. Shapley values, first introduced in game theory, can provide explanations of the result generated from a specific set of input data by a complex ML algorithm.

Methods: For a multivariate molecular diagnostic test in clinical use (the VeriStrat® test), we calculate and discuss the interpretation of exact Shapley values. We also employ some standard approximation techniques for Shapley value computation (local interpretable model-agnostic explanation (LIME) and Shapley Additive Explanations (SHAP) based methods) and compare the results with exact Shapley values.

Results: Exact Shapley values calculated for data collected from a cohort of 256 patients showed that the relative importance of attributes for test classification varied by sample. While all eight features used in the VeriStrat® test contributed equally to classification for some samples, other samples showed more complex patterns of attribute importance for classification generation. Exact Shapley values and Shapley-based interaction metrics were able to provide interpretable classification explanations at the sample or patient level, while patient subgroups could be defined by comparing Shapley value profiles between patients. LIME and SHAP approximation approaches, even those seeking to include correlations between attributes, produced results that were quantitatively and, in some cases, qualitatively different from the exact Shapley values.

Conclusions: Shapley values can be used to determine the relative importance of input attributes to the result generated by a multivariate molecular diagnostic test for an individual sample or patient. Patient subgroups defined by Shapley value profiles may motivate translational research. However, correlations inherent in molecular data and the typically small ML training sets available for molecular diagnostic test development may cause some approximation methods to produce approximate Shapley values that differ both qualitatively and quantitatively from exact Shapley values. Hence, caution is advised when using approximate methods to evaluate Shapley explanations of the results of molecular diagnostic tests.
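As a minimal, illustrative sketch (not the authors' implementation), exact Shapley values for a test with a small number of attributes, such as the eight VeriStrat features, can be computed by enumerating all coalitions. The value function `value_fn` below, and the way "absent" features are handled (e.g. substitution by reference values), are hypothetical placeholders.

```python
from itertools import combinations
from math import factorial

def exact_shapley_values(value_fn, features):
    """Exact Shapley values by brute-force enumeration of coalitions.

    value_fn : maps a frozenset of feature indices (a coalition of "present"
               features) to the model output obtained when the remaining
               features are treated as absent (e.g. set to reference values).
    features : iterable of feature indices, e.g. range(8) for a test built
               on eight attributes.
    """
    features = list(features)
    n = len(features)
    phi = {}
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):  # |S| ranges over 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value_fn(s | {i}) - value_fn(s))
        phi[i] = total
    return phi

# Toy usage: for an additive "game" the Shapley value of each feature
# recovers its additive weight, which is a quick sanity check.
weights = [0.5, 1.0, -0.3, 0.8, 0.0, 0.2, -0.1, 0.4]
toy_value = lambda coalition: sum(weights[k] for k in coalition)
print(exact_shapley_values(toy_value, range(8)))
```

With eight attributes the enumeration involves only 2^8 = 256 distinct coalitions, which is why exact computation is tractable for a test of this size.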

Highlights

  • Machine learning (ML) can be an effective tool to extract information from attribute-rich molecular datasets for the generation of molecular diagnostic tests

  • We investigated three expressions proposed to characterize the importance of pairs of features, or interactions, for classification: Shapley interaction indices (SII) [14], Shapley–Taylor interaction indices (STII) [16], and Harsanyi dividends (HD) [7]; the STII main effects term for i = j is defined as STII_ii(f) = f({i}) − f(∅)

  • Exact Shapley interaction indices and Harsanyi dividends: to assess the importance of pairs of features to the classification from the VS algorithm for each instance, we evaluated three previously proposed quantities: SIIs [14], Shapley–Taylor interaction indices (STIIs) [16], and HDs [7]; standard forms of these quantities are sketched after these highlights. (Note that while SIIs and STIIs evaluate the contribution of features i and j in the context of coalitions of other features, HDs only consider features i and j in isolation.) The results are shown in the heatmap of Fig. 5 for all pairs of distinct features i, j (i ≠ j) for six instances: a uniform Good, a non-uniform Good, and a boundary Good instance, and corresponding examples of Poor instances
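For reference, the quantities named above can be written as follows. This is our transcription of the standard definitions from the cited works ([14], [16], [7]); the notation, in particular the discrete second difference Δ_ij, is ours, and normalization conventions may differ slightly from the original references.

```latex
% Shapley value of feature i for value function f on subsets of N, |N| = n
\[ \phi_i(f) = \sum_{S \subseteq N \setminus \{i\}}
   \frac{|S|!\,(n-|S|-1)!}{n!}\,\bigl[f(S \cup \{i\}) - f(S)\bigr] \]

% discrete second difference used by the pairwise indices
\[ \Delta_{ij} f(S) = f(S \cup \{i,j\}) - f(S \cup \{i\}) - f(S \cup \{j\}) + f(S) \]

% Shapley interaction index (SII) [14]
\[ \mathrm{SII}_{ij}(f) = \sum_{S \subseteq N \setminus \{i,j\}}
   \frac{|S|!\,(n-|S|-2)!}{(n-1)!}\,\Delta_{ij} f(S) \]

% Shapley-Taylor interaction index (STII) of order 2 [16], for i \neq j,
% with main effects term STII_{ii}(f) = f(\{i\}) - f(\emptyset)
\[ \mathrm{STII}_{ij}(f) = \frac{2}{n} \sum_{S \subseteq N \setminus \{i,j\}}
   \binom{n-1}{|S|}^{-1} \Delta_{ij} f(S) \]

% Harsanyi dividend (HD) of the pair {i, j} [7]
\[ d_{\{i,j\}}(f) = f(\{i,j\}) - f(\{i\}) - f(\{j\}) + f(\emptyset) \]
```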


Introduction

Machine learning (ML) can be an effective tool to extract information from attribute-rich molecular datasets for the generation of molecular diagnostic tests. Shapley values, first introduced in game theory, can provide explanations of the result generated from a specific set of input data by a complex ML algorithm. For a multivariate molecular diagnostic test in clinical use (the VeriStrat® test), we calculate and discuss the interpretation of exact Shapley values. Diagnostic tests produced from large numbers of attributes via ML can be effective predictors of outcome, making use of the information in these highly multivariate data inputs to improve performance and robustness. However, neither the way in which the tests produce a result for a given patient, nor the biological rationale underlying the tests, may be transparent. Concerns about biases in ML implementations, including those containing the attributes gender or race [2, 3], and the recognition of the right of individuals to understand how their personal data are being used, have highlighted the need for interpretable explanations and quantification of how attributes are used by complex ML algorithms [4]
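The paper compares exact Shapley values with LIME- and SHAP-based approximations. Purely as an illustrative sketch of how such approximate values are commonly obtained (this is not the authors' pipeline; the scoring function, background data, and sampling budget below are hypothetical placeholders), the shap package's KernelExplainer can be applied to any scoring function:

```python
import numpy as np
import shap  # SHAP approximation library (Lundberg & Lee)

def model_score(X):
    # Hypothetical placeholder scoring function; in practice this would be
    # the trained diagnostic classifier's continuous output.
    return X.sum(axis=1)

background = np.zeros((1, 8))   # reference values standing in for "absent" attributes
explainer = shap.KernelExplainer(model_score, background)

x = np.random.rand(1, 8)        # a single instance (eight attributes) to explain
approx_values = explainer.shap_values(x, nsamples=2000)
print(approx_values)            # approximate Shapley values, one per attribute
```

KernelExplainer samples coalitions and, by default, treats attributes as independent when substituting background values, which is one reason the paper cautions that approximate methods can diverge from exact Shapley values for correlated molecular attributes.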

