Abstract

Purpose of Study: In this study we demonstrate accurate prediction of the impact of somatic mutations on the HLA presentation landscape, achieved by interrogating a large-scale database of 1.4 million unique HLA peptide sequences that have been directly identified by mass spectrometry.

Background: Peptides presented to the immune system on HLA complexes are valuable targets for immunotherapeutic treatments. Identifying the full complement of peptides derived from a particular protein that are presented on major class I HLA restrictions will provide a vital step toward increasing the speed and viability of many immunotherapeutic strategies. Advances in next-generation sequencing (NGS) and single-cell technologies have enabled the accurate capture of somatic mutations accumulated by a tumor, yet how this information can be utilized for immunotherapeutic benefit remains a significant hurdle. In particular, identifying which somatic mutations produce neoantigens (peptides that contain a somatic mutation and are presented to the immune system in complex with HLA) is crucial to linking genetic changes with immunologic impact.

Materials and Methods: Our approach to understanding the targetable human HLA peptidome is based on three key principles: achieving full proteome coverage, maximising individual protein coverage, and focusing on dominant HLA restrictions. By integrating novel cell biology, mass spectrometry, and bioinformatic technologies across over 1,000 individual experiments, we have dramatically increased the depth of the HLA ligandome captured and achieved near-total coverage of the protein-coding genome. Over 90% of the proteome has been captured for the restriction HLA-A*02:01, dominant in Caucasian populations. Our comprehensive genome coverage has enabled us to probe both directly and indirectly for the presence of neoantigens.
Known somatic mutations within immortalized lines were used to generate bespoke reference databases, which led to the direct identification of many hundreds of neoantigens.

Results: Proteins that were found to contain neoantigens appeared to follow the same pattern of antigen processing and presentation as their unmutated equivalents. We therefore find that our HLA peptide dataset offers significant value in predicting the likelihood of a somatic mutation creating a neoantigen. To test this, somatic mutations reported in 980 cell lines were probed against the database of HLA peptides. On average we find one peptide containing the mutated amino acid for every five somatic mutations reported. By incorporating the HLA background of the cell carrying the mutation, we narrow this prediction to one high-affinity HLA peptide for every fourteen somatic mutations reported. Comparing the peptides predicted in this analysis with those directly identified by mass spectrometry, we show that we can prioritize mutation data by accurately predicting the presence and relative abundance of neoantigens. Our neoantigen prediction process is fully incorporated into a large-scale database system, enabling us to seamlessly integrate NGS data from individual tissue and use peptidomic data to rapidly define the targetable landscape of an individual.

Conclusions: An integrative approach to HLA peptidomics has delivered a powerful reference database for developing novel immunotherapies.

Citation Format: Alex S. Powlesland, Geert P.M. Mommen, Ricardo J. Carreira, Jacob Hurst, Michael J. Cundell, David Lowne, Floriana Capuano, Bent K. Jakobsen. Exploiting large-scale HLA peptidomics to generate novel immunotherapies: A data-driven approach to true neoantigen prioritization [abstract]. In: Proceedings of the Fourth CRI-CIMT-EATI-AACR International Cancer Immunotherapy Conference: Translating Science into Survival; Sept 30-Oct 3, 2018; New York, NY. Philadelphia (PA): AACR; Cancer Immunol Res 2019;7(2 Suppl):Abstract nr B086.
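
The lookup described under Results, probing reported somatic mutations against a database of observed HLA peptides and then narrowing by the HLA background of the cell, might be sketched roughly as below. This is an illustrative sketch only, not the authors' pipeline: the record types, field names, the function `candidate_neoantigens`, and the toy peptide sequences are all hypothetical. It assumes single-residue substitutions and relies on the abstract's observation that mutated proteins appear to be processed like their unmutated equivalents, so each wild-type peptide spanning the mutated position is treated as a candidate. Binding affinity is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutation:
    """Hypothetical record for a single amino-acid substitution."""
    protein: str   # source protein identifier
    position: int  # 1-based residue position in the protein
    ref: str       # wild-type residue
    alt: str       # mutated residue


@dataclass(frozen=True)
class ObservedPeptide:
    """Hypothetical record for a wild-type HLA peptide seen by mass spectrometry."""
    protein: str   # source protein identifier
    start: int     # 1-based start position within the protein
    sequence: str  # wild-type peptide sequence
    hla: str       # HLA restriction, e.g. "HLA-A*02:01"


def candidate_neoantigens(mutation, peptides, cell_hla=None):
    """Return (mutated_sequence, hla) pairs for observed peptides spanning the mutation.

    If cell_hla (a set of allele names) is given, only peptides observed on
    an allele carried by the cell are kept.
    """
    candidates = []
    for pep in peptides:
        if pep.protein != mutation.protein:
            continue
        offset = mutation.position - pep.start
        if not 0 <= offset < len(pep.sequence):
            continue  # peptide does not cover the mutated position
        if pep.sequence[offset] != mutation.ref:
            continue  # reference residue disagrees with the observed peptide
        if cell_hla is not None and pep.hla not in cell_hla:
            continue  # restriction not present in this cell's HLA background
        mutated = pep.sequence[:offset] + mutation.alt + pep.sequence[offset + 1:]
        candidates.append((mutated, pep.hla))
    return candidates


if __name__ == "__main__":
    # Toy data, invented for illustration.
    database = [
        ObservedPeptide("P1", 10, "ALDKTYSLV", "HLA-A*02:01"),
        ObservedPeptide("P1", 12, "DKTYSLVPR", "HLA-B*07:02"),
    ]
    mutation = Mutation("P1", 13, "K", "E")
    print(candidate_neoantigens(mutation, database))
    print(candidate_neoantigens(mutation, database, cell_hla={"HLA-A*02:01"}))
```

Counting mutations that yield at least one candidate, with and without the `cell_hla` filter, would give ratios analogous to the "one in five" and "one in fourteen" figures reported, though the abstract's second figure also involves a high-affinity criterion that this sketch omits.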