Abstract
T cell-inducing vaccines are key to the development of effective therapies against cancer and infectious diseases. Peptides presented by Human Leukocyte Antigens (HLAs) are the targets of T cells, and their identification is therefore critical to the development of such vaccines. Here, we present the latest improvement in our EDGE™ (Epitope Discovery for GEnomes) platform to address this critical need. EDGE comprises AI models that predict peptide presentation by HLA class I and class II. Although the models are trained primarily on immunopeptidomics data, EDGE scores are predictive of peptide-HLA immunogenicity. There are three class I presentation models in EDGE: an allele-specific model, a pan-specific model, and a model specific for infectious diseases. The allele-specific model is applicable to a large but pre-defined set of HLA alleles. On a large test dataset, the allele-specific model achieved an average precision (AP) of 63% (PPV40=79%), compared with an AP of 21% (PPV40=28%) for a standard, best-available public model. A Phase 1/2 clinical study of personalized cancer vaccines encoding neoantigens predicted by the allele-specific model demonstrated a molecular response rate of ~50% (molecular response defined as a >=30% reduction in circulating tumor DNA relative to baseline), with associated extended overall survival (vs non-responders), in patients with metastatic, microsatellite-stable colorectal cancer. We observed that >50% of the mutations were able to elicit T cell responses. The pan-specific class I model takes HLA sequences as an input feature during training and is therefore applicable to any HLA. On the same test dataset as above, it achieved an AP of 65% (PPV40=81%) and performed better on average for ~40 less-common HLA alleles. Prediction of viral peptide presentation by HLA class I is challenging due to the lack of immunopeptidomics data. 
The class I model for infectious diseases was specifically optimized to predict presentation of viral peptides and, accordingly, performed better than available class I models on published HIV and influenza A datasets. Prediction of peptide presentation by HLA class II is challenging because of the flexibility with which the longer peptides interact with open HLA grooves, and because of the relative scarcity of immunopeptidomics data compared with class I peptides. The class II model in EDGE, EDGE-II, uses the latest developments in protein large language models, a novel learned HLA allele-deconvolution strategy, and in-house immunopeptidomics data, resulting in improved prediction of peptide presentation by HLA class II and of immunogenicity driven by CD4+ T cells. On a benchmark validation dataset, EDGE-II achieved an AP of 71%, compared with an AP of 62% for a leading published model. In summary, EDGE™ provides a comprehensive, state-of-the-art platform for the development of vaccines that can induce both CD8+ and CD4+ T cell responses to provide durable benefit to patients.
Citation Format: Joshua Klein, Daniel Sprague, Monica Lane, Meghan Hart, Olivia Petrillo, Italo Faria do Valle, Matthew Davis, Andrew Ferguson, Andrew Allen, Karin Jooss, Ankur Dhanik. AI platform provides an EDGE and enables state-of-the-art identification of peptide-HLAs for the development of T cell inducing vaccines [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 904.
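
The AP and PPV40 figures quoted in the abstract are ranking metrics over scored peptide-HLA pairs. The abstract does not define PPV40; the sketch below assumes it denotes positive predictive value (precision) at 40% recall, and that AP is the usual mean of the precision values at the rank of each true positive. The function names and toy data are hypothetical illustrations and are not taken from EDGE.

```python
from typing import List, Sequence


def _ranked_labels(labels: Sequence[int], scores: Sequence[float]) -> List[int]:
    """Return labels ordered by descending score (ties keep input order)."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of precision@rank taken at the rank of every positive example."""
    ranked = _ranked_labels(labels, scores)
    true_pos = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked, start=1):
        if label:
            true_pos += 1
            precision_sum += true_pos / rank
    if true_pos == 0:
        raise ValueError("at least one positive label is required")
    return precision_sum / true_pos


def ppv_at_recall(labels: Sequence[int], scores: Sequence[float],
                  recall: float = 0.4) -> float:
    """Precision at the first rank where recall reaches the target.

    ASSUMPTION: this is one plausible reading of "PPV40"; the abstract
    itself does not spell out the definition.
    """
    ranked = _ranked_labels(labels, scores)
    total_pos = sum(ranked)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    true_pos = 0
    for rank, label in enumerate(ranked, start=1):
        true_pos += label
        # Small tolerance guards against floating-point rounding.
        if true_pos / total_pos >= recall - 1e-12:
            return true_pos / rank
    raise ValueError("target recall must be in (0, 1]")


if __name__ == "__main__":
    # Toy data: 1 = peptide observed as presented, 0 = decoy (hypothetical).
    toy_labels = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
    print(f"AP    = {average_precision(toy_labels, toy_scores):.3f}")
    print(f"PPV40 = {ppv_at_recall(toy_labels, toy_scores, 0.4):.3f}")
```

Because both metrics depend only on the ranking of scores, they allow models with differently scaled outputs to be compared on the same test set, which is how the abstract uses them.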