Abstract

Characterizing the human leukocyte antigen (HLA) bound ligandome by mass spectrometry (MS) holds great promise for developing vaccines and drugs for immune-oncology. Still, the identification of non-tryptic peptides presents substantial computational challenges. To address these, we synthesized and analyzed >300,000 peptides by multi-modal LC-MS/MS within the ProteomeTools project representing HLA class I & II ligands and products of the proteases AspN and LysN. The resulting data enabled training of a single model using the deep learning framework Prosit, allowing the accurate prediction of fragment ion spectra for tryptic and non-tryptic peptides. Applying Prosit demonstrates that the identification of HLA peptides can be improved up to 7-fold, that 87% of the proposed proteasomally spliced HLA peptides may be incorrect and that dozens of additional immunogenic neo-epitopes can be identified from patient tumors in published data. Together, the provided peptides, spectra and computational tools substantially expand the analytical depth of immunopeptidomics workflows.

Highlights

  • Characterizing the human leukocyte antigen (HLA) bound ligandome by mass spectrometry (MS) holds great promise for developing vaccines and drugs for immune-oncology

  • We reasoned that generating high-quality LC-MS/MS data for a representative set of synthetic peptides with precisely known sequences may form the basis for the computational prediction of spectra and chromatographic retention times for any peptide

  • The HLA sequences were selected from published HLA ligandomes[11,12,13] (Fig. 1b) and AspN/LysN peptides were drawn from a large unpublished study


Summary

Introduction

Characterizing the human leukocyte antigen (HLA) bound ligandome by mass spectrometry (MS) holds great promise for developing vaccines and drugs for immune-oncology. Applying Prosit demonstrates that the identification of HLA peptides can be improved up to 7-fold, that 87% of the proposed proteasomally spliced HLA peptides may be incorrect, and that dozens of additional immunogenic neo-epitopes can be identified from patient tumors in published data. Compared to the analysis of (mostly) tryptic peptides in proteomics, the analysis of HLA peptides poses substantial challenges. These arise from the fact that HLA peptides are generated by unspecific protease cleavage. This alters the characteristics of the tandem mass spectra (i.e., the type and intensity of the fragment ions observed) and vastly increases the number of sequences a search engine has to consider as a possible match to a particular MS/MS spectrum[5]. We show that expanding our deep learning framework Prosit[7] to the interpretation of non-tryptic peptides greatly improves the identification of HLA peptides and neo-epitopes, and we demonstrate that proteasomal splicing of peptides is much rarer than anticipated.
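The growth in search space caused by unspecific cleavage can be illustrated with a small counting exercise. The sketch below is not part of Prosit or of any search engine; the function names, the toy sequence, and the 8 to 12 residue length window (a commonly used range for HLA class I ligands) are assumptions made only for illustration. It compares the number of candidate peptides a search would have to consider under a simplified tryptic rule (cleave after K or R, no missed cleavages, proline rule ignored) with the number obtained when cleavage may occur at any position.

```python
import re


def tryptic_peptides(protein, min_len=8, max_len=12):
    """Fully tryptic peptides within the length window.

    Simplified rule (an assumption): cleave after every K or R,
    no missed cleavages, proline exception ignored.
    """
    pieces = re.findall(r".*?[KR]|.+$", protein)
    return {p for p in pieces if min_len <= len(p) <= max_len}


def unspecific_peptides(protein, min_len=8, max_len=12):
    """Every substring within the length window (cleavage anywhere)."""
    return {
        protein[start:start + length]
        for length in range(min_len, max_len + 1)
        for start in range(len(protein) - length + 1)
    }


if __name__ == "__main__":
    # Hypothetical toy sequence: the 20 standard amino acids, once each.
    toy_protein = "ACDEFGHIKLMNPQRSTVWY"
    print(len(tryptic_peptides(toy_protein)))     # 1
    print(len(unspecific_peptides(toy_protein)))  # 55
```

Even for this 20-residue toy sequence the unspecific candidate set is 55 times larger than the tryptic one, and the gap widens with protein length because the number of unspecific candidates grows with every additional residue.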

