In multilevel lumbar disc herniation (LDH), selecting the surgical approach for percutaneous transforaminal endoscopic discectomy (PTED) is challenging and relies heavily on the physician's judgment. This study aimed to develop a deep learning (DL)-based multimodal model that analyzes imaging and clinical data together to give physicians objective, referenceable support for this decision. In this retrospective study, imaging and clinical data were collected from patients with multilevel LDH. The MR scans of each segment were fed concurrently into a multi-input ResNet-50 model to predict the target segment. The scan of the target segment was then input to a custom model to predict the PTED approach direction. Clinical data, including the patient's lower limb sensory and motor functions, served as feature variables in a machine learning (ML) model for the same predictions. Bayesian optimization was used to determine the optimal weights for fusing the DL and ML models. For PTED target segment prediction, the multimodal model achieved an accuracy of 93.8%, compared with 87.7% for the DL model and 87.0% for the ML model. For PTED approach direction, the multimodal model achieved an accuracy of 89.3%, significantly higher than the DL model's 87.8% and the ML model's 87.6%. The multimodal model thus predicted both PTED target segments and approach directions accurately, and its predictive performance surpassed that of the individual DL and ML models.
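The weighted fusion of the DL and ML outputs can be illustrated with a minimal sketch. It is not the study's implementation: it assumes each model emits class probabilities, that fusion is a convex combination controlled by a single weight `w`, and it substitutes a plain grid search for the Bayesian optimization named in the abstract. The function names (`fuse`, `accuracy`, `best_weight`) and the toy validation data are hypothetical.

```python
from typing import List, Sequence


def fuse(p_dl: Sequence[float], p_ml: Sequence[float], w: float) -> List[float]:
    """Convex combination of two class-probability vectors (w = DL weight)."""
    return [w * a + (1.0 - w) * b for a, b in zip(p_dl, p_ml)]


def accuracy(probs_dl, probs_ml, labels, w: float) -> float:
    """Fraction of cases where the argmax of the fused vector matches the label."""
    correct = 0
    for p_dl, p_ml, y in zip(probs_dl, probs_ml, labels):
        fused = fuse(p_dl, p_ml, w)
        pred = max(range(len(fused)), key=fused.__getitem__)
        correct += int(pred == y)
    return correct / len(labels)


def best_weight(probs_dl, probs_ml, labels, steps: int = 21) -> float:
    """Grid search over w in [0, 1]; a stand-in for Bayesian optimization."""
    candidates = [i / (steps - 1) for i in range(steps)]
    return max(candidates, key=lambda w: accuracy(probs_dl, probs_ml, labels, w))


# Toy validation set (invented numbers): each model alone misclassifies one case.
toy_dl = [[0.90, 0.10], [0.40, 0.60], [0.45, 0.55]]
toy_ml = [[0.30, 0.70], [0.20, 0.80], [0.90, 0.10]]
toy_labels = [0, 1, 0]

w_star = best_weight(toy_dl, toy_ml, toy_labels)
print("selected DL weight:", w_star)
print("fused accuracy:", accuracy(toy_dl, toy_ml, toy_labels, w_star))
```

In practice the weight would be tuned on a held-out validation split rather than on the test cases, and a Bayesian optimizer would replace the grid when there are several weights or the objective is expensive to evaluate.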