Abstract
Introduction: Predicting the conversion of clinically isolated syndrome (CIS) to clinically definite multiple sclerosis (CDMS) is critical to personalizing treatment planning and to benefiting patients. The aim of this study is to develop an explainable machine learning (ML) model for predicting this conversion based on demographic, clinical, and imaging data.

Method: An Extreme Gradient Boosting (XGBoost) model was applied to a public dataset of 273 Mexican mestizo CIS patients with 10-year follow-up. The data were divided into a training set for cross-validation and feature selection, and a holdout test set for final testing. Feature importance was determined using the SHapley Additive exPlanations (SHAP) library. Two experiments were then conducted to optimize the model's performance by selectively adding variables and selecting the most contributive variables for the final model.

Results: Nine variables were significant: age, gender, schooling, motor symptoms, infratentorial and periventricular lesions on imaging, oligoclonal bands in cerebrospinal fluid, and lesion and symptom types. In cross-validation, the model achieved an accuracy of 83.6%, AUC of 91.8%, sensitivity of 83.9%, and specificity of 83.4%. In the final testing, the model achieved an accuracy of 78.3%, AUC of 85.8%, sensitivity of 75%, and specificity of 81.1%. Finally, a web-based demo of the model was created for testing purposes.

Conclusion: The model, focusing on feature selection and interpretability, effectively stratifies risk for treatment decisions and disability prevention in MS patients. It provides a numerical risk estimate for CDMS conversion, enhancing transparency in clinical decision-making and aiding in patient care.
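For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, sensitivity, specificity, and AUC can be computed for a binary CIS-to-CDMS classifier. It is not the authors' code: the function names `binary_metrics` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the numbers are not study data.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity from 0/1 labels and predictions.

    Assumes both classes occur in y_true (otherwise a denominator is zero).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # converters flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-converters cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = converted to CDMS, 0 = did not convert.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.6, 0.2, 0.1]  # predicted conversion risk
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why an abstract like this one reports both.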