Abstract

Machine learning algorithms that are both interpretable and accurate are essential in applications such as medicine, where errors can have dire consequences. Unfortunately, there is currently a tradeoff between accuracy and interpretability among state-of-the-art methods. Decision trees are interpretable and are therefore used extensively throughout medicine for stratifying patients. Current decision tree algorithms, however, are consistently outperformed in accuracy by other, less interpretable machine learning models, such as ensemble methods. We present MediBoost, a novel framework for constructing decision trees that retain interpretability while achieving accuracy similar to that of ensemble methods, and compare MediBoost’s performance to that of conventional decision trees and ensemble methods on 13 medical classification problems. MediBoost significantly outperformed current decision tree algorithms on 11 of the 13 problems, giving accuracy comparable to ensemble methods. The resulting trees are of the same type as the decision trees used throughout clinical practice but have the advantage of improved accuracy. Our algorithm thus gives the best of both worlds: it grows a single, highly interpretable tree that has the high accuracy of ensemble methods.

Highlights

  • Patient stratification involves the integration of complex data structures that include gene-expression patterns, individual proteins, proteomic patterns, metabolomics, histology, and imaging [2], all of which machine learning algorithms can analyze effectively

  • We present a framework for constructing decision trees that match the accuracy of ensemble methods while maintaining high interpretability

  • When the areas under the curve (AUC) are compared using the Wilcoxon signed-rank test with Bonferroni adjustment for multiple comparisons, MediBoost is significantly better than ID3 (p = 8.69 × 10⁻¹⁰) and CART (p = 8.89 × 10⁻⁹) but not significantly different from LogitBoost (p = 0.85); a sketch of this comparison follows the highlights

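The statistical comparison in the last highlight can be reproduced in outline with standard tools. The sketch below is illustrative only: the AUC values are randomly generated placeholders rather than the paper's results, and SciPy's wilcoxon is used as the signed-rank test.

    import numpy as np
    from scipy.stats import wilcoxon

    rng = np.random.default_rng(0)

    # Placeholder AUCs for 13 paired problems; illustrative only, not the paper's data.
    auc_mediboost = rng.uniform(0.80, 0.95, size=13)
    auc_cart = auc_mediboost - rng.uniform(0.01, 0.10, size=13)

    # Paired, non-parametric comparison of per-dataset AUCs.
    stat, p = wilcoxon(auc_mediboost, auc_cart)

    # Bonferroni adjustment: multiply by the number of pairwise comparisons
    # (three in the paper's setting: vs. ID3, CART, and LogitBoost).
    p_adjusted = min(p * 3, 1.0)
    print(f"Wilcoxon statistic = {stat:.2f}, Bonferroni-adjusted p = {p_adjusted:.4g}")

The Wilcoxon signed-rank test is a natural choice here because the same 13 datasets are scored by every method, so the per-dataset AUC differences form paired samples with no normality assumption.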

Introduction

Patient stratification involves the integration of complex data structures that include gene-expression patterns, individual proteins, proteomic patterns, metabolomics, histology, and imaging [2], all of which machine learning algorithms can analyze effectively. Other sources of information, such as electronic medical records, the scientific literature, and physician experience and intuition, are more difficult to integrate. For this reason, interpretability is a core requirement for machine-learned models used in medicine. A classifier is considered interpretable if its classifications can be explained by a conjunction of conditional statements, i.e., if-then rules, about the collected data, in our case the data used for patient stratification; this view of a tree as a rule list is sketched below. Under this definition, standard decision trees, such as those learned by ID3 or CART, are considered interpretable, but ensemble methods are not. We present a framework for constructing decision trees that match the accuracy of ensemble methods while maintaining high interpretability. This unique combination of accuracy and interpretability addresses a long-standing challenge in machine learning that is essential for medical applications. The applications of our algorithm are not limited to medicine; it could be used in any other domain that employs decision trees.
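The if-then view of a decision tree can be made concrete in a few lines of code. The sketch below is a minimal illustration, assuming scikit-learn and using its bundled breast cancer dataset as a stand-in for a patient-stratification problem; the dataset and tree depth are our choices, not the paper's.

    from sklearn.datasets import load_breast_cancer
    from sklearn.tree import DecisionTreeClassifier, export_text

    data = load_breast_cancer()

    # A shallow CART-style tree keeps the extracted rule list short enough to read.
    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(data.data, data.target)

    # Each root-to-leaf path printed below is one conjunction of if-then
    # conditions over the measured features, i.e., an interpretable rule.
    print(export_text(tree, feature_names=list(data.feature_names)))

An ensemble of such trees would require reading hundreds of these rule lists and combining their votes, which is why ensemble methods fail this interpretability criterion even though each constituent tree passes it.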

