This study aimed to develop and validate machine learning (ML) and deep learning (DL) models that use drug-induced sleep endoscopy (DISE) images to predict the therapeutic efficacy of hypoglossal nerve stimulator (HGNS) implantation. Patients who underwent DISE and subsequent HGNS implantation at a tertiary care referral center were included. Six DL models and five ML algorithms were trained on images of the base of tongue (BOT) and velopharynx (VP) from patients classified as responders or non-responders according to Sher's criteria (50% reduction in apnea-hypopnea index (AHI) and AHI < 15 events/h). Precision, recall, F1 score, and overall accuracy were evaluated as measures of performance. In total, 25,040 images from 127 patients were included, of which 16,515 (69.3%) were from responders and 8,262 (30.7%) from non-responders. Models trained on the VP dataset had greater overall accuracy than models trained on BOT images alone or on the combined VP and BOT image sets, suggesting that VP images contain discriminative features for identifying therapeutic efficacy. The VGG-16 DL model had the best overall performance on the VP image set, with high training accuracy (0.833), F1 score (0.78), and recall (0.883). Among ML models, the logistic regression model had the greatest accuracy (0.685) and F1 score (0.813). Deep neural networks have the potential to predict HGNS therapeutic efficacy from DISE images, facilitating better patient selection for implantation. Building multi-institutional data and image sets will allow development of generalizable predictive models. Level of Evidence: NA. Laryngoscope, 134:5210-5216, 2024.
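
As an illustration only, the responder definition and the performance measures named in the abstract can be sketched in Python. This is not the authors' code: the function names `is_responder` and `classification_metrics` are hypothetical, and treating "50% reduction" as "at least 50%" is an assumption, since the abstract does not say whether the threshold is inclusive.

```python
from typing import Dict, Sequence


def is_responder(pre_ahi: float, post_ahi: float) -> bool:
    """Responder per the abstract's stated criteria.

    Assumption: "50% reduction" is read as a reduction of at least 50%
    relative to the preoperative AHI, combined with a postoperative
    AHI below 15 events/h.
    """
    if pre_ahi <= 0:
        raise ValueError("preoperative AHI must be positive")
    reduction = (pre_ahi - post_ahi) / pre_ahi
    return reduction >= 0.5 and post_ahi < 15.0


def classification_metrics(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Dict[str, float]:
    """Precision, recall, F1 score, and accuracy for binary labels (1 = responder)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(y_true)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}
```

Because responders outnumber non-responders in the image set, a model can achieve a high F1 score on the responder class while overall accuracy stays modest, which is one reason the abstract reports both measures.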