Abstract

The vast majority of tools for neoepitope prediction from DNA sequencing of complementary tumor and normal patient samples do not consider germline context or the potential for the co-occurrence of two or more somatic variants on the same mRNA transcript. Without consideration of these phenomena, existing approaches are likely to produce both false-positive and false-negative results, resulting in an inaccurate and incomplete picture of the cancer neoepitope landscape. We developed neoepiscope chiefly to address this issue for single nucleotide variants (SNVs) and insertions/deletions (indels). Herein, we illustrate how germline and somatic variant phasing affects neoepitope prediction across multiple datasets. We estimate that up to ∼5% of neoepitopes arising from SNVs and indels may require variant phasing for their accurate assessment. neoepiscope is performant, flexible and supports several major histocompatibility complex binding affinity prediction tools.

neoepiscope is available on GitHub at https://github.com/pdxgx/neoepiscope under the MIT license. Scripts for reproducing results described in the text are available at https://github.com/pdxgx/neoepiscope-paper under the MIT license. Additional data from this study, including summaries of variant phasing incidence and benchmarking wallclock times, are available in Supplementary Files 1, 2 and 3. Supplementary File 1 contains Supplementary Table 1, Supplementary Figures 1 and 2, and descriptions of Supplementary Tables 2-8. Supplementary File 2 contains Supplementary Tables 2-6 and 8. Supplementary File 3 contains Supplementary Table 7.
Raw sequencing data used for the analyses in this manuscript are available from the Sequence Read Archive under accessions PRJNA278450, PRJNA312948, PRJNA307199, PRJNA343789, PRJNA357321, PRJNA293912, PRJNA369259, PRJNA305077, PRJNA306070, PRJNA82745 and PRJNA324705; from the European Genome-phenome Archive under accessions EGAD00001004352 and EGAD00001002731; and by direct request to the authors. Supplementary data are available at Bioinformatics online.
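
The effect of ignoring co-occurring variants can be illustrated with a toy example. The sketch below is not neoepiscope's implementation: the reference sequence, the two SNVs, the helper names and the peptide length of 3 are all hypothetical choices made for illustration (real neoepitope prediction typically enumerates longer peptides), and the two variants are assumed to lie in cis on the same transcript. It compares peptides enumerated from each variant independently with peptides enumerated from the phased haplotype.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def apply_snvs(seq, snvs):
    """Return seq with each (0-based position, alt base) substitution applied."""
    bases = list(seq)
    for pos, alt in snvs:
        bases[pos] = alt
    return "".join(bases)


def translate(seq):
    """Translate complete codons of a coding sequence (no stop handling)."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def kmers(protein, k):
    """Set of all length-k peptides in a protein sequence."""
    return {protein[i:i + k] for i in range(len(protein) - k + 1)}


def novel_peptides(ref, snvs, k):
    """Peptides present after applying snvs but absent from the reference."""
    return kmers(translate(apply_snvs(ref, snvs)), k) - kmers(translate(ref), k)


# Hypothetical coding sequence (translates to MAERWK) and two nearby SNVs.
REF = "ATGGCTGAACGTTGGAAA"
SNV_A = (4, "T")  # GCT -> GTT (A -> V)
SNV_B = (7, "G")  # GAA -> GGA (E -> G)
K = 3             # toy peptide length

# Unphased: each variant is considered on its own, and results are pooled.
unphased = novel_peptides(REF, [SNV_A], K) | novel_peptides(REF, [SNV_B], K)
# Phased: both variants are applied to the same haplotype (assumed in cis).
phased = novel_peptides(REF, [SNV_A, SNV_B], K)

print("false positives:", sorted(unphased - phased))  # ['AGR', 'MAG', 'MVE', 'VER']
print("false negatives:", sorted(phased - unphased))  # ['MVG', 'VGR']
```

Under the in-cis assumption, the unphased enumeration reports peptides that the tumor would not express and misses peptides that span both substitutions. If the variants were instead in trans, the independently enumerated peptides would be the correct ones, which is why the phase itself must be known.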

Highlights

  • While mutations may promote oncogenesis, cancer-specific variants and the corresponding novel peptides they may produce (“neoepitopes”) appear central to the generation of adaptive anti-tumor immune response [1]

  • As ~15% of neoepitopes are estimated to result from other types of mutations [3], additional tools were developed to predict neoepitopes from gene fusions (e.g. INTEGRATE-neo [4]), non-stop mutations (e.g. TSNAD [5]), and insertions and deletions, which may be of particular significance for anticipating cancer immunotherapy response [8]

  • We found that variant co-occurrence increases approximately linearly with increasing nucleotide span, with an overall average of 2.72% and 0.43% of somatic variants located within a 72 bp inter-variant distance of another germline or somatic variant, respectively (Figure 3A)
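
A co-occurrence rate of this kind can be computed as sketched below. This is a minimal illustration rather than the analysis code used in the study: the function names and the example positions are hypothetical, positions are assumed to lie on a single chromosome, and "within 72 bp" is assumed to mean an absolute distance of at most 72.

```python
from bisect import bisect_left, bisect_right


def count_in_window(pos, sorted_positions, max_distance):
    """Number of positions p with |p - pos| <= max_distance."""
    lo = bisect_left(sorted_positions, pos - max_distance)
    hi = bisect_right(sorted_positions, pos + max_distance)
    return hi - lo


def fraction_with_neighbor(somatic, others, max_distance=72, same_set=False):
    """Fraction of somatic variants lying near at least one variant in others.

    Set same_set=True when others is the somatic list itself, so that a
    variant is not counted as its own neighbor.
    """
    if not somatic:
        return 0.0
    others_sorted = sorted(others)
    min_hits = 2 if same_set else 1
    near = sum(
        1
        for pos in somatic
        if count_in_window(pos, others_sorted, max_distance) >= min_hits
    )
    return near / len(somatic)


# Hypothetical variant positions on one chromosome.
somatic = [100, 150, 1000, 5000]
germline = [120, 4000, 5070]

print(fraction_with_neighbor(somatic, germline))                # 0.75
print(fraction_with_neighbor(somatic, somatic, same_set=True))  # 0.5
```

Repeating the calculation over a range of `max_distance` values would give the relationship between co-occurrence and nucleotide span described above.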

Summary

Introduction

While mutations may promote oncogenesis, cancer-specific variants and the corresponding novel peptides they may produce (“neoepitopes”) appear central to the generation of adaptive anti-tumor immune response [1]. As ~15% of neoepitopes are estimated to result from other types of mutations [3], additional tools were developed to predict neoepitopes from gene fusions (e.g. INTEGRATE-neo [4]), non-stop mutations (e.g. TSNAD [5]), and insertions and deletions (indels; e.g. pVACseq [6], MuPeXI [7]), which may be of particular significance for anticipating cancer immunotherapy response [8]. Many of these tools offer comparable predictive capabilities, but each approach has its own set of features and limitations (see Figure 1).

