Machine learning has recently transformed medical diagnosis and prediction, rivaling human experts in accuracy. However, its limited interpretability makes it difficult to meet the stringent standards of medical decision-making. To facilitate interpretable predictive inference, this article proposes an intrinsically explainable machine learning model oriented toward small samples: an Optimal Variational Bayesian Logistic Regression (OVBLR) model with a salient feature estimation strategy (OVBLR-SFE). OVBLR-SFE learns sequentially from limited labeled samples and mines disease-related risk factors, automatically giving its predictive inference intrinsic interpretability. It does so by approximating the posterior probability with the proposed OVBLR model, which yields an estimate of each feature's weight and of its influence on the model outputs; these estimates drive salient feature selection. The effectiveness and superiority of OVBLR-SFE are demonstrated through extensive experiments on four benchmark medical datasets from the UCI repository (the Bone Marrow Transplant: Children Data Set, Breast Cancer Wisconsin (Original) Data Set, SPECT Heart Data Set, and Heart Disease Data Set) and on a real-world case of intensive care unit readmission prediction. On the four UCI datasets, OVBLR-SFE achieves accuracies of 92.02%, 97.21%, 86.71%, and 84.47%, respectively, for a stable average classification accuracy of 90.10%. In particular, the significant features inferred by OVBLR-SFE align with those obtained by the post-hoc interpretation approaches SHAP and AcME, demonstrating its interpretability. In the real-world application, the proposed method achieved an accuracy of 88.11% in predicting the readmission of liver transplantation patients. Notably, the inferred salient features, such as total intensive care unit length of stay (ICUTime), N-terminal pro-brain natriuretic peptide (NTproBNP), operating time of surgery (OperationTime_h), and intraoperative red blood cell transfusion (RBCT), emerge as crucial in readmission prediction, in line with the assessments of clinical experts. The source code of this study can be accessed at https://github.com/xqw42/OVBLR-SFE.
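
For readers unfamiliar with the underlying technique, the following is a minimal sketch of a generic variational Bayesian logistic regression, followed by a simple feature ranking. It is not the authors' OVBLR-SFE implementation (which is available at the repository above). It assumes the standard Jaakkola-Jordan variational bound with a zero-mean isotropic Gaussian prior. The names `fit_vb_logistic` and `rank_features`, the `prior_precision` parameter, the synthetic data, and the ranking score (posterior mean divided by posterior standard deviation) are all illustrative assumptions, used here as a stand-in for the paper's salient feature estimation strategy.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_vb_logistic(X, y, prior_precision=1.0, n_iter=50):
    """Generic variational Bayes for logistic regression (Jaakkola-Jordan bound).

    Assumption: prior w ~ N(0, prior_precision^{-1} I); labels y are in {0, 1}.
    Returns the mean m and covariance S of the Gaussian posterior approximation.
    """
    n, d = X.shape
    xi = np.ones(n)                        # one variational parameter per sample
    S0_inv = prior_precision * np.eye(d)   # prior precision matrix
    for _ in range(n_iter):
        # lambda(xi) = (sigmoid(xi) - 1/2) / (2 xi) = tanh(xi / 2) / (4 xi)
        lam = np.tanh(xi / 2.0) / (4.0 * xi)
        # Posterior precision: S^{-1} = S0^{-1} + 2 * sum_n lam_n x_n x_n^T
        S = np.linalg.inv(S0_inv + 2.0 * (X.T * lam) @ X)
        # Posterior mean (prior mean is zero): m = S * sum_n (y_n - 1/2) x_n
        m = S @ (X.T @ (y - 0.5))
        # Re-estimate xi_n^2 = x_n^T (S + m m^T) x_n
        xi = np.sqrt(np.einsum("ij,jk,ik->i", X, S + np.outer(m, m), X))
        xi = np.maximum(xi, 1e-8)          # guard against division by zero
    return m, S


def rank_features(m, S):
    """Hypothetical saliency score: |posterior mean| / posterior std. dev."""
    score = np.abs(m) / np.sqrt(np.diag(S))
    return np.argsort(-score)              # feature indices, most salient first


# Synthetic illustration: only features 0 and 2 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = np.array([3.0, 0.0, -2.0, 0.0, 0.0])
y = (rng.random(200) < sigmoid(X @ w_true)).astype(float)

m, S = fit_vb_logistic(X, y)
print("posterior mean:", np.round(m, 2))
print("feature ranking:", rank_features(m, S))
```

Because the approximate posterior is Gaussian, every weight comes with an uncertainty estimate, so a ranking of this kind can be read directly from the fitted model rather than from a separate post-hoc explainer.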