Abstract

<h3>Purpose/Objective(s)</h3> Treatment for HPV-associated oropharyngeal carcinoma (HPV-OPC) achieves excellent disease control but carries high treatment-related morbidity, leading to interest in de-escalation strategies. ECOG-ACRIN Cancer Research Group E3311 is a Phase II trial in which HPV-OPC patients were treated with transoral robotic surgery and randomized postoperatively to de-escalated radiotherapy, so long as they lacked high-risk pathologic criteria, including macroscopic (>1 mm) extranodal extension (ENE). The protocol excluded patients with radiographically or clinically matted nodes, yet >30% of enrollees demonstrated ENE and required adjuvant chemoradiation. In prior work, we developed and validated a deep learning algorithm (DLA) to predict pathologic ENE from diagnostic computed tomography (CT). We hypothesized that the DLA could generalize to the HPV-OPC E3311 patient population, a diagnostically challenging cohort in which its use could be particularly impactful. <h3>Materials/Methods</h3> We obtained pretreatment CTs and corresponding surgical pathology reports from E3311 as available. The lymph node with the largest short-axis diameter (SAD) and up to two additional nodes were segmented on each scan and annotated for ENE based on pathology reports. DLA performance for ENE prediction was compared directly to that of four board-certified head and neck radiologists (R1-4), who were provided with a validated educational tool to help diagnose ENE. The primary endpoint was area under the receiver operating characteristic curve (AUC), compared via the DeLong method. Secondary measures were sensitivity (1 - false negative rate) and specificity (1 - false positive rate (FPR)). Interrater agreement was evaluated with Fleiss' kappa. <h3>Results</h3> From 177 collected scans, 311 lymph nodes were annotated: 71 (23%) with ENE, 39 (13%) with >1 mm ENE. Of the annotated nodes, 202 (65%) had SAD ≥1 cm.
DLA AUC for ENE classification was .85 [95%CI: .80 - .89], outperforming R1-4 (Table, P<.001 for each). Reader specificity and sensitivity showed high variability with poor interrater agreement (Table, kappa: .28). Matching DLA specificity to that of the reader with the highest AUC (R2) (FPR<22%) improved sensitivity to 75% (+13%). Setting DLA FPR to <30% yielded sensitivity of 87% (Table). DLA also showed improved performance versus R1-4 for >1 mm ENE (AUC: .85 [95%CI: .80 - .89] vs mean AUC: .68 [95%CI: .62 - .74], P<.001) and for nodes with SAD ≥1 cm (AUC: .74 [95%CI: .67 - .81] vs mean AUC: .58 [95%CI: .55 - .62], P<.001). <h3>Conclusion</h3> DLA showed high performance in predicting pathologic microscopic and macroscopic ENE in a challenging cohort of patients with HPV-OPC from a prospective clinical trial, substantially outperforming expert head and neck radiologists from tertiary care centers. DLA should be evaluated in a clinical trial with the goal of reducing trimodality therapy.
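
The operating-point comparison reported in the Results (capping the DLA false positive rate, then reading off sensitivity) can be illustrated with a minimal Python sketch. This is not the study's analysis code: the function names `rates_at` and `threshold_for_max_fpr` are hypothetical, and the labels and scores are invented toy values, not E3311 data.

```python
from typing import List, Tuple


def rates_at(labels: List[int], scores: List[float], threshold: float) -> Tuple[float, float]:
    """Return (sensitivity, FPR) when a score >= threshold is called ENE-positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    sensitivity = tp / (tp + fn)  # 1 - false negative rate
    fpr = fp / (fp + tn)          # 1 - specificity
    return sensitivity, fpr


def threshold_for_max_fpr(labels: List[int], scores: List[float], max_fpr: float) -> float:
    """Lowest observed score threshold whose FPR is strictly below max_fpr.

    FPR can only fall as the threshold rises, so the first qualifying
    threshold in ascending order also gives the highest sensitivity.
    """
    for t in sorted(set(scores)):
        _, fpr = rates_at(labels, scores, t)
        if fpr < max_fpr:
            return t
    return float("inf")  # no observed threshold qualifies: call nothing positive


# Toy data (assumption, for illustration only): 1 = pathologic ENE, 0 = no ENE.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.1, 0.05]

toy_threshold = threshold_for_max_fpr(toy_labels, toy_scores, max_fpr=0.22)
toy_sensitivity, toy_fpr = rates_at(toy_labels, toy_scores, toy_threshold)
```

Fixing the FPR first puts model and reader at a comparable specificity, so any remaining difference shows up in sensitivity alone.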
