<h3>Purpose/Objective(s)</h3> P16 immunohistochemistry (IHC) is less specific than in situ hybridization (ISH) for detecting human papillomavirus (HPV)-related oropharyngeal cancer (OPC). However, given its high sensitivity and wide availability, P16 IHC is a standard-of-care requirement for OPC staging and is also used to determine clinical trial eligibility. Because routine clinical management does not require both P16 and HPV testing, reports on the prognosis of P16/HPV-discordant tumors vary. We aimed to assess the prognosis of P16/HPV-discordant OPC, using natural language processing (NLP) techniques to extract test results from free-text OPC pathology reports. <h3>Materials/Methods</h3> We reviewed our institutional OPC radiation database and included a series of patients from 7/1994 to 6/2020 with digitized pathology reports. Patients were excluded if they did not receive curative-intent radiation, did not have both P16 IHC and HPV ISH results, or had an equivocal result on either test. A total of 422 patients were included, with a median follow-up of 87.8 months (range, 0.3-241.0). We performed text extraction and classification to ascertain P16 and HPV results from the free-text pathology reports. All reports were also evaluated manually by two independent reviewers, and this manual review served as our gold standard. Four groups were identified: P16-negative/HPV-negative (P16-/HPV-, n=60), P16-negative/HPV-positive (P16-/HPV+, n=9), P16-positive/HPV-negative (P16+/HPV-, n=49), and P16-positive/HPV-positive (P16+/HPV+, n=304). The Kaplan-Meier method was used to estimate cancer-specific survival (CSS) and overall survival (OS). <h3>Results</h3> Our NLP techniques could not process 39% of reports. The algorithm accurately coded 92.6% of P16 results and 94.2% of HPV results. Positive predictive value (precision), sensitivity (recall), and F-score were 99.5%, 91.7%, and 95.5% for P16 and 99.0%, 94.0%, and 96.5% for HPV. Overall, 13.7% of tumors were discordant (P16-/HPV+ or P16+/HPV-). 
As expected, P16-/HPV- tumors had the worst CSS and OS, and P16+/HPV+ tumors had an excellent prognosis (Table 1). P16-/HPV+ tumors constituted only 2.1% of the cohort, but their outcomes compared favorably with those of P16+/HPV+ tumors. P16+/HPV- tumors had favorable outcomes in the first two years, but with longer follow-up their outcomes fell between those of P16-/HPV- and P16+/HPV+ tumors. <h3>Conclusion</h3> P16 and HPV results were accurately abstracted from the majority of free-text OPC pathology reports. To our knowledge, this is the first example of employing NLP techniques to extract data from head and neck cancer pathology reports and correlate the results with clinical outcomes. P16/HPV-discordant tumors constitute a minority of cases, but their divergent prognoses suggest that performing both tests may be useful, especially for clinical trial eligibility or treatment de-escalation.
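The summary figures above can be rechecked from the reported group sizes with a short Python sketch. This is only an illustrative calculation, not the authors' pipeline: the helper names `group_label` and `f_score` are hypothetical, and the F-score is assumed to be the balanced F1 (harmonic mean of precision and recall), which the abstract does not state explicitly.

```python
def group_label(p16_positive: bool, hpv_positive: bool) -> str:
    """Build a four-group label such as 'P16+/HPV-' (hypothetical helper)."""
    p16 = "+" if p16_positive else "-"
    hpv = "+" if hpv_positive else "-"
    return f"P16{p16}/HPV{hpv}"


def f_score(precision: float, recall: float) -> float:
    """Balanced F1; assumed here, since the abstract does not name the variant."""
    return 2 * precision * recall / (precision + recall)


# Group sizes as reported in the abstract.
counts = {
    group_label(False, False): 60,
    group_label(False, True): 9,
    group_label(True, False): 49,
    group_label(True, True): 304,
}

total = sum(counts.values())

# Discordant tumors are those where the two tests disagree.
discordant = counts["P16-/HPV+"] + counts["P16+/HPV-"]
discordant_pct = round(100 * discordant / total, 1)
p16neg_hpvpos_pct = round(100 * counts["P16-/HPV+"] / total, 1)

# F1 recomputed from the rounded precision and recall in the abstract.
f1_p16 = f_score(0.995, 0.917)
f1_hpv = f_score(0.990, 0.940)

print(total, discordant_pct, p16neg_hpvpos_pct)
print(round(100 * f1_p16, 1), round(100 * f1_hpv, 1))
```

The group sizes sum to the 422 included patients and reproduce the reported 13.7% discordance and 2.1% P16-/HPV+ share. Because the precision and recall inputs are already rounded, the recomputed F-scores can differ from the published values by about 0.1 percentage point.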