Ischemic stroke (IS) has a high recurrence rate. Machine learning (ML) models based on single-modal biochemical test data or imaging data have been developed to predict stroke recurrence. However, the prediction accuracy of these models is not sufficiently high. Therefore, this study aimed to collect biochemical test and magnetic resonance imaging (MRI) data to establish a dataset and to propose a high-performance heterogeneous multimodal IS recurrence prediction model based on deep learning. This is a retrospective cohort study. Data were retrospectively collected from 634 IS patients in Zhuhai, China, and a 12-month follow-up was conducted to determine stroke recurrence. We propose the ischemic stroke multi-group learning (ISGL) model, an integrated multimodal model for predicting the risk of IS recurrence in patients, based on a capsule neural network and a linear support vector machine (SVM). Two capsule neural network prediction models based on T1 and T2 signals in the MRI data and an SVM prediction model based on biochemical test data were established. Finally, the outputs of these three models were combined by voting to produce the final judgment of the integrated model. The ISGL model was compared with 6 classical ML and deep learning models: k-nearest neighbors, SVM, logistic regression, random forest, eXtreme Gradient Boosting, and visual geometry group (VGG). The results revealed that the accuracy, specificity, sensitivity, and area under the curve of the ISGL model were 95%, 96%, 94%, and 95%, respectively. Among the comparison models, the VGG method exhibited the best performance, but it was much lower than that of the ISGL model. Analysis of the importance of the biochemical test data revealed that low-density lipoprotein, smoking, and heart disease history were positively correlated factors, and total cholesterol, high-density lipoprotein, and diabetes were negatively correlated factors.
This study proposes the ISGL model, which uses MRI and biochemical data simultaneously to predict IS recurrence. This combination achieved higher performance than the other ML models. Additionally, this study identified risk factors associated with recurrence, which can be used to intervene in high-risk patients as early as possible and to promote the development of secondary stroke prevention.
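The voting step that combines the two MRI-based models and the biochemical model can be illustrated with a minimal sketch. The abstract does not state the voting rule, so a simple hard majority vote over binary labels (1 = recurrence, 0 = no recurrence) is assumed here. The function names `majority_vote` and `binary_metrics` and the toy labels are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Sequence


def majority_vote(*model_preds: Sequence[int]) -> List[int]:
    """Combine binary predictions from several models by hard majority vote.

    Assumption: each model outputs 1 (recurrence) or 0 (no recurrence)
    for the same patients, in the same order.
    """
    if not model_preds:
        raise ValueError("at least one model's predictions are required")
    n_patients = len(model_preds[0])
    if any(len(p) != n_patients for p in model_preds):
        raise ValueError("all models must predict the same patients")
    # With an odd number of voters (three here), ties cannot occur.
    return [1 if 2 * sum(votes) > len(votes) else 0
            for votes in zip(*model_preds)]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity from the confusion-matrix counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Sensitivity: fraction of true recurrences that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: fraction of non-recurrences correctly ruled out.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


if __name__ == "__main__":
    # Toy labels for six hypothetical patients; NOT data from the study.
    t1_model = [1, 0, 1, 0, 1, 0]       # stand-in for the T1 capsule network
    t2_model = [1, 1, 0, 0, 1, 0]       # stand-in for the T2 capsule network
    biochem_model = [0, 1, 1, 0, 1, 1]  # stand-in for the linear SVM
    y_true = [1, 1, 1, 0, 0, 0]

    final = majority_vote(t1_model, t2_model, biochem_model)
    print(final)
    print(binary_metrics(y_true, final))
```

With three voters, a patient is flagged as high risk only when at least two of the three models agree, which is one plausible reading of how an integrated judgment could be formed from heterogeneous inputs.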