Abstract

This study aimed to develop and validate a multiparametric MRI-based radiomics model, using optimal oversampling and machine learning techniques, for predicting human papillomavirus (HPV) status in oropharyngeal squamous cell carcinoma (OPSCC).

This retrospective, multicenter study included consecutive patients with newly diagnosed, pathologically confirmed OPSCC between January 2017 and December 2020 (110 patients in the training set, 44 patients in the external validation set). A total of 293 radiomics features were extracted from three sequences (T2-weighted images [T2WI], contrast-enhanced T1-weighted images [CE-T1WI], and apparent diffusion coefficient [ADC] maps). Combinations of three feature selection, five oversampling, and 12 machine learning techniques were evaluated to optimize the model's diagnostic performance. The top five models were then validated in the external validation set using the area under the receiver operating characteristic curve (AUC).

A total of 154 patients (age, 59.2 ± 9.1 years; 132 men [85.7%]) were included, and oversampling was employed to account for the data imbalance between HPV-positive and HPV-negative OPSCC (86.4% [133/154] vs. 13.6% [21/154]). For the ADC radiomics model, the combination of random oversampling and ridge showed the highest diagnostic performance in the external validation set (AUC, 0.791; 95% CI, 0.775-0.808). The ADC radiomics model tended to show higher diagnostic performance than the radiomics models using CE-T1WI (AUC, 0.604; 95% CI, 0.590-0.618), T2WI (AUC, 0.695; 95% CI, 0.673-0.717), and a combination of both (AUC, 0.642; 95% CI, 0.626-0.657).

The ADC radiomics model using random oversampling and ridge showed the highest diagnostic performance in predicting the HPV status of OPSCC in the external validation set. Among the sequences evaluated, the ADC radiomics model has potential for generalizability and applicability in clinical practice. Exploring multiple oversampling and machine learning techniques was a valuable strategy for optimizing radiomics model performance.
• Previous radiomics studies using multiparametric MRI were conducted at single centers without external validation and had unresolved data imbalances.
• Among the ADC, CE-T1WI, and T2WI radiomics models and the ADC histogram models, the ADC radiomics model was the best-performing model for predicting human papillomavirus status in oropharyngeal squamous cell carcinoma.
• The ADC radiomics model with the combination of random oversampling and ridge showed the highest diagnostic performance.
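
To make two of the techniques named above concrete, the following is a minimal sketch of random oversampling and of AUC computation, written in plain Python. It is not the authors' implementation: the function names `random_oversample` and `auc` are hypothetical, the feature values are synthetic placeholders, and the class counts (133 vs. 21) simply mirror the cohort-level imbalance reported in the abstract. Feature selection and the ridge classifier are outside the scope of this sketch.

```python
import random


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    rows_by_class = {}
    for row, label in zip(features, labels):
        rows_by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in rows_by_class.values())

    out_features, out_labels = [], []
    for label, rows in rows_by_class.items():
        # Sample with replacement from the existing rows of this class.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            out_features.append(row)
            out_labels.append(label)
    return out_features, out_labels


def auc(scores, labels):
    """AUC as the probability that a positive case (label 1) scores higher
    than a negative case (label 0); ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic placeholder data with the reported 133:21 class ratio.
toy_features = [[float(i)] for i in range(154)]
toy_labels = [1] * 133 + [0] * 21

balanced_features, balanced_labels = random_oversample(toy_features, toy_labels)
print(balanced_labels.count(1), balanced_labels.count(0))
```

In a workflow like the one described, oversampling would normally be applied to the training set only, so that the external validation set keeps its original class distribution when the AUC is computed.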
