Abstract

The set of peptides presented on a cell's surface by MHC molecules is known as the immunopeptidome. Current mass spectrometry technologies allow for identification of large peptidomes, and studies have shown these data to be a rich source of information for learning the rules of MHC-mediated antigen presentation. Immunopeptidomes are usually poly-specific, containing multiple sequence motifs matching the MHC molecules expressed in the system under investigation. Motif deconvolution, the process of associating each ligand with its presenting MHC molecule(s), is therefore a critical and challenging step in the analysis of MS-eluted MHC ligand data. Here, we describe NNAlign_MA, a computational method designed to address this challenge and fully benefit from large, poly-specific data sets of MS-eluted ligands. NNAlign_MA simultaneously performs the tasks of (1) clustering peptides into individual specificities; (2) automatic annotation of each cluster to an MHC molecule; and (3) training of a prediction model covering all MHCs present in the training set. NNAlign_MA was benchmarked on large and diverse data sets, covering class I and class II data. In all cases, the method was demonstrated to outperform state-of-the-art methods, effectively expanding the coverage of alleles for which accurate predictions can be made and improving the identification of both eluted ligands and T-cell epitopes. Given its high flexibility and ease of use, we expect NNAlign_MA to serve as an effective tool to increase our understanding of the rules of MHC antigen presentation and guide the development of novel T-cell-based therapeutics.

Highlights

  • The set of peptides presented on a cell’s surface by Major Histocompatibility Complex (MHC) molecules is poly-specific

  • A key issue associated with the interpretation and analysis of LC-MS MHC eluted ligand data sets (EL data) stems from the challenge of deconvoluting and linking each ligand back to the presenting MHC molecule(s) of the investigated cell lines

  • The NNAlign_MA algorithm is an extension of the NNAlign neural network framework. It can take a mixed training set composed of single-allele data (SA, peptides assigned to single MHCs) and multi-allele data (MA, peptides assigned to multiple MHCs), fully deconvolute the individual MHC restriction of all MA peptides, and learn the binding specificities of the MHCs present in the training set
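
The annotation idea behind deconvolution can be illustrated with a minimal Python sketch. It assumes that each MA peptide is assigned to whichever candidate MHC scores highest under the current model. The scoring function `toy_score`, the table `TOY_MOTIFS`, and the example peptides are hypothetical stand-ins introduced here for illustration only. They are not the NNAlign_MA neural network and are not taken from the source.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical, heavily simplified anchor preferences (0-based positions).
# A real model would be a trained neural network; this is only a stand-in.
TOY_MOTIFS: Dict[str, Dict[int, str]] = {
    "HLA-A*02:01": {1: "LM", 8: "VL"},
    "HLA-B*07:02": {1: "P", 8: "LF"},
}


def toy_score(peptide: str, allele: str) -> float:
    """Fraction of anchor positions whose residue matches the toy motif."""
    motif = TOY_MOTIFS[allele]
    hits = sum(
        1
        for pos, residues in motif.items()
        if pos < len(peptide) and peptide[pos] in residues
    )
    return hits / len(motif)


def annotate_ma_peptides(
    ma_data: List[Tuple[str, List[str]]],
    score: Callable[[str, str], float] = toy_score,
) -> List[Tuple[str, str]]:
    """Assign each multi-allele (MA) peptide to its highest-scoring candidate MHC.

    ma_data holds (peptide, candidate_alleles) pairs, where the candidates are
    the MHC molecules expressed by the cell line the peptide was eluted from.
    """
    annotated = []
    for peptide, alleles in ma_data:
        best = max(alleles, key=lambda allele: score(peptide, allele))
        annotated.append((peptide, best))
    return annotated


if __name__ == "__main__":
    cell_line = ["HLA-A*02:01", "HLA-B*07:02"]
    # Illustrative input peptides only.
    for peptide, allele in annotate_ma_peptides(
        [("SLYNTVATL", cell_line), ("SPRAAAAAL", cell_line)]
    ):
        print(peptide, allele)
```

In an iterative setting, annotations of this kind would be recomputed as the model is retrained, so that the SA data guide the initial assignment and the MA data refine it. That loop is omitted here.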


Summary

Introduction

The set of peptides presented on a cell’s surface by MHC molecules is poly-specific: it contains multiple sequence motifs matching the MHC molecules expressed. NNAlign_MA exploits this type of data by simultaneously performing the tasks of [1] clustering peptides into individual specificities; [2] automatic annotation of each cluster to an MHC molecule; and [3] training of a prediction model covering all MHCs present in the training set. Major Histocompatibility Complex (MHC) molecules play a central role in the cellular immune system as cell-surface presenters of antigenic peptides to T-cell receptors (TCR). The peptide-MHC complex (pMHC) is scrutinized by T cells, and an immune response can be initiated if interactions between the pMHC and TCR are established.

