Abstract

Differentiating Crohn’s disease (CD) from intestinal tuberculosis (ITB) is difficult but crucial for medical decisions. This study aims to develop an effective framework to distinguish the two diseases through an explainable machine learning (ML) model. After feature selection, nine variables are retained: intestinal surgery, abdominal, bloody stool, PPD, knot, ESAT-6, CFP-10, intestinal dilatation and comb sign. We also compare the predictive performance of the ML methods with that of traditional statistical methods and, for the first time, provide insight into the ML model’s output through the SHAP method. Data from a cohort of 200 patients (CD = 160, ITB = 40) are used to train and validate the models. The XGBoost algorithm outperforms the other classifiers in terms of area under the receiver operating characteristic curve (AUC), sensitivity, specificity, precision and Matthews correlation coefficient (MCC), yielding values of 0.891, 0.813, 0.969, 0.867 and 0.801, respectively. More importantly, the predictions of XGBoost can be effectively explained through the SHAP method. The proposed framework demonstrates the effectiveness of distinguishing CD from ITB through interpretable machine learning, which provides not only a global explanation but also an explanation for individual patients.
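
As a point of reference for the metrics reported above, the following minimal Python sketch shows how sensitivity, specificity, precision and MCC follow from the four cells of a binary confusion matrix. The function name `binary_metrics` and the example counts are illustrative assumptions; the counts are not taken from the study, and AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, precision and MCC from confusion-matrix counts."""
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": _ratio(tp, tp + fn),  # true positive rate
        "specificity": _ratio(tn, tn + fp),  # true negative rate
        "precision": _ratio(tp, tp + fp),    # positive predictive value
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical counts chosen for easy arithmetic; NOT results from the study.
example = binary_metrics(tp=8, fp=2, tn=38, fn=2)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

MCC uses all four cells of the confusion matrix, which makes it a common choice when the classes are imbalanced, as in the 160:40 cohort described here.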

Highlights

  • Differentiation between Crohn’s disease and intestinal tuberculosis is difficult but crucial for medical decisions

  • The main contributions of this research are as follows: (1) This paper proposes an effective framework to address a real-world problem, differentiating Crohn’s disease (CD) from intestinal tuberculosis (ITB); (2) The framework improves predictive performance by combining the SMOTE algorithm with machine learning; (3) The framework provides local interpretation and direct visualization of results without sacrificing classification accuracy, based on a model-independent interpretable machine learning algorithm; (4) As far as we know, this is the first interpretable machine learning framework developed to distinguish CD from ITB, which may improve medical workers’ acceptance of prediction outcomes

  • PPD and the tuberculosis (TB) interferon-gamma (IFN-γ) release assay (TB-IGRA) are both associated with mycobacteria, and have a relatively high sensitivity and specificity for the diagnosis of ITB, especially TB-IGRA
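
The second contribution listed above relies on SMOTE to rebalance the classes. The core idea of SMOTE, creating synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours, can be sketched in plain Python as follows. This is a simplified illustration, not the authors' pipeline or any library's implementation; `smote_sketch`, its parameters and the toy points are hypothetical, and straight interpolation is only meaningful for continuous features.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between minority neighbours.

    minority: list of feature vectors (lists of floats), at least two points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy minority class (hypothetical data): the four corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote_sketch(minority, n_new=6)
print(len(synthetic), "synthetic samples generated")
```

Because every synthetic point lies on a segment between two real minority points, oversampling stays inside the region the minority class already occupies. It should be applied to the training split only, so that validation data contain no synthetic samples.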


Summary

Introduction

Differentiation between Crohn’s disease (CD) and intestinal tuberculosis (ITB) is difficult but crucial for medical decisions. This study aims to develop an effective framework to distinguish these two diseases through an explainable machine learning (ML) model. Some clinical presentation, radiological, endoscopic and histological features can improve the diagnostic accuracy for CD and ITB. Statistical theory provides a great variety of methods that have been used to determine sensitivity indices and improve the diagnostic accuracy for CD and ITB [3,13]; the logistic regression model (LOG) is the most popular. Moreover, the LOG model has the significant advantage that its results are easy to interpret, providing a straightforward probability for individual patients. Owing to these advantages, LOG has gradually come to be regarded as a scoring method for diagnosing diseases. These methods still have several limitations: (1) they may struggle to capture complex nonlinear interactions between variables; (2) they are highly sensitive to abnormal values; and (3) they have difficulty handling class imbalance.
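
To illustrate why a logistic regression model reads naturally as a per-patient score, the sketch below turns a weighted sum of features into a probability. The coefficients, intercept and feature names are invented for illustration; they are not fitted values from this study and carry no clinical meaning.

```python
import math


def logistic_probability(features, coefficients, intercept):
    """Map a weighted sum of feature values to a probability in [0, 1]."""
    z = intercept + sum(coefficients[name] * value for name, value in features.items())
    # Numerically stable sigmoid: avoids overflow for large negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


# Hypothetical model: two binary features with made-up weights.
coefficients = {"feature_a": 1.2, "feature_b": -0.9}
intercept = -0.5
patient = {"feature_a": 1, "feature_b": 0}

probability = logistic_probability(patient, coefficients, intercept)
print(f"predicted probability: {probability:.3f}")
```

Each coefficient shifts the log-odds by a fixed amount, which is what makes the model easy to interpret. The same additive form is why it cannot represent nonlinear interactions between variables unless such terms are added by hand.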

