e18059

Background: The biological behavior of cancers is extremely complex and is governed by the nonlinear interplay of thousands of genes. This complexity is a major hurdle to drawing mechanistic conclusions in rare clinical scenarios where observations are limited, such as treatment-resistant HPV+ head and neck squamous cell carcinoma. Recently, data augmentation algorithms have been used to increase the number and diversity of training observations while preserving the underlying structure and nonlinear relationships of the data features. Deep learning techniques such as generative adversarial networks (GANs) have shown promise as robust tools for augmenting non-image data. This study evaluates the utility of employing separate GANs to highlight gene expression differences between treatment-sensitive (TS) and treatment-resistant (TR) HPV+ head and neck squamous cell carcinoma.

Methods: We performed RNAseq on 17 patients with HPV-positive oropharyngeal squamous cell carcinoma (OPSCC). We defined TR patients as those with locally advanced OPSCC at presentation who rapidly recurred after receiving standard of care therapy (n = 11). In most cases, these patients failed to respond to salvage therapies and rapidly died. In contrast, we defined TS patients as those with locally advanced OPSCC at presentation who received standard of care therapy and were followed for 5 years without any evidence of recurrence (n = 6). These two cohorts were used to train two separate GANs, each of which synthesized 500 robust synthetic samples of its phenotype for downstream analysis.

Results: Analysis of the 17 original samples yielded fewer than 100 differentially expressed (DE) genes passing FDR cutoffs, which hindered pathway enrichment analysis (PAE). DE testing between the synthetic samples yielded 1196 DE genes. Validating our deep learning approach, these genes belong to coherent pathways as determined by PAE. Namely, the VEGFA-VEGFR2 pathway was strongly enriched in the TR population (p = 1.5 × 10⁻⁴). Further, the NKT cell signature (p = 3.0 × 10⁻⁶²) and the gene library of T-cell receptor induction of apoptosis (p = 5.4 × 10⁻⁹) were enriched in the TS population. Lastly, to validate the experimental method, we performed DE testing on GAN-augmented cohorts of renal papillary vs renal clear cell samples from The Cancer Genome Atlas, trained on 10 samples per condition. This reliably predicted the DE genes found when comparing the full cohorts of renal papillary (n = 534) vs renal clear cell (n = 295) samples, with an accuracy of 93% and an F1 score of 88%.

Conclusions: To our knowledge, this is the first study to utilize separate GANs to augment structural gene expression changes occurring in rare clinical scenarios such as treatment-resistant HPV+ OPSCC. This method uncovers changes in angiogenesis and T-cell function between treatment-sensitive and treatment-resistant HPV+ OPSCC.
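The validation step reported in the Results (DE calls from a small augmented cohort scored against DE calls from the full cohort) can be illustrated with a minimal sketch. The abstract does not state which DE test or FDR procedure was used, nor how accuracy and F1 were computed, so everything below is an assumption: the Benjamini-Hochberg adjustment, the 0.05 cutoff, the function names, and the toy gene labels and p-values are hypothetical and chosen only for illustration.

```python
from typing import Dict, Set, Tuple


def benjamini_hochberg(pvalues: Dict[str, float]) -> Dict[str, float]:
    """Return Benjamini-Hochberg adjusted p-values keyed by gene."""
    genes = sorted(pvalues, key=pvalues.get)  # ascending raw p-value
    m = len(genes)
    adjusted: Dict[str, float] = {}
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        gene = genes[rank - 1]
        running_min = min(running_min, pvalues[gene] * m / rank)
        adjusted[gene] = running_min
    return adjusted


def de_calls(pvalues: Dict[str, float], fdr: float = 0.05) -> Set[str]:
    """Genes whose adjusted p-value passes the (assumed) FDR cutoff."""
    return {g for g, q in benjamini_hochberg(pvalues).items() if q <= fdr}


def accuracy_and_f1(
    predicted: Set[str], reference: Set[str], universe: Set[str]
) -> Tuple[float, float]:
    """Score predicted DE genes against reference DE genes over all tested genes."""
    tp = len(predicted & reference)
    fp = len(predicted - reference)
    fn = len(reference - predicted)
    tn = len(universe) - tp - fp - fn
    accuracy = (tp + tn) / len(universe)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical p-values: DE test on the augmented small cohort vs the full cohort.
augmented_p = {"G1": 0.001, "G2": 0.004, "G3": 0.02, "G4": 0.5, "G5": 0.9}
full_p = {"G1": 1e-6, "G2": 1e-4, "G3": 0.2, "G4": 0.6, "G5": 0.01}

predicted = de_calls(augmented_p)   # DE genes called from augmented data
reference = de_calls(full_p)        # DE genes called from the full cohort
acc, f1 = accuracy_and_f1(predicted, reference, set(full_p))
print(f"accuracy={acc:.2f}, F1={f1:.2f}")
```

Treating the full-cohort DE calls as the reference labels makes each tested gene one binary prediction, which is one plausible way an accuracy and an F1 score over genes could be obtained; the authors' actual scoring scheme may differ.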