Systematically benchmarking peptide-MHC binding predictors: From synthetic to naturally processed epitopes.

Weilong Zhao, Xinwei Sher

doi:10.1371/journal.pcbi.1006457

Open Access

https://doi.org/10.1371/journal.pcbi.1006457

Journal: PLOS Computational Biology	Publication Date: Nov 8, 2018
Citations: 139	License type: CC BY 4.0

Affiliation: MSD (United States)

Abstract

A number of machine learning-based predictors have been developed for identifying immunogenic T-cell epitopes based on major histocompatibility complex (MHC) class I and II binding affinities. Rationally selecting the most appropriate tool has been complicated by the evolving training data and machine learning methods. Despite recent advances in generating high-quality MHC-eluted, naturally processed ligandomes, the reliability of new predictors on these epitopes has yet to be evaluated. This study reports the latest benchmarking of an extensive set of MHC-binding predictors using newly available, untested data of both synthetic and naturally processed epitopes. The blind test set includes 32 human leukocyte antigen (HLA) class I and 24 HLA class II alleles. Artificial neural network (ANN)-based approaches demonstrated better performance than regression-based machine learning and structural modeling. Among the 18 predictors benchmarked, the ANN-based mhcflurry and nn_align perform best for MHC class I 9-mer and class II 15-mer predictions, respectively, on binding/non-binding classification (Area Under Curve = 0.911). NetMHCpan4 also demonstrated comparable predictive power. Our customization of mhcflurry into a pan-HLA predictor achieved accuracy similar to NetMHCpan. The overall accuracy of these methods is comparable between 9-mer and 10-mer testing data. However, the top methods deliver low correlations between predicted and experimental affinities for strong MHC binders. When used on naturally processed MHC ligands, tools that have been trained on elution data (NetMHCpan4 and MixMHCpred) show better accuracy than pure binding-affinity predictors. The false prediction rate varies considerably among HLA types and datasets. Finally, the structure-based predictor Rosetta FlexPepDock is less optimal than the machine learning approaches.
With our benchmarking of MHC-binding and MHC-elution predictors using comprehensive metrics, an unbiased view for establishing best practice in T-cell epitope prediction is presented, facilitating future development of methods in immunogenomics.
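
The two kinds of metric named in the abstract, binding/non-binding classification scored by area under the ROC curve and the correlation between predicted and experimental affinities, can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The 500 nM IC50 cutoff for calling a peptide a binder is a commonly used convention assumed here, not a value stated on this page, and the affinity values are invented purely for illustration.

```python
import math
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random binder outscores a random
    non-binder, counting ties as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one binder and one non-binder")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical measured and predicted IC50 values in nM (illustrative only).
measured_ic50 = [20.0, 150.0, 480.0, 900.0, 5000.0, 30000.0]
predicted_ic50 = [35.0, 600.0, 300.0, 450.0, 8000.0, 12000.0]

BINDER_THRESHOLD_NM = 500.0  # assumed convention, not taken from the source

# A peptide is labeled a binder when its measured IC50 is below the cutoff.
labels = [1 if v < BINDER_THRESHOLD_NM else 0 for v in measured_ic50]

# Lower IC50 means stronger binding, so negate the log to get a score
# where higher means "more likely a binder".
scores = [-math.log10(v) for v in predicted_ic50]

auc = roc_auc(labels, scores)
r = pearson([math.log10(v) for v in measured_ic50],
            [math.log10(v) for v in predicted_ic50])
print(f"AUC = {auc:.3f}, Pearson r (log10 IC50) = {r:.3f}")
```

A predictor can reach a high AUC while still correlating poorly with measured affinities within the strong-binder subset, which is why the two metrics are reported separately.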

Highlights

We demonstrate that recent advances in incorporating high-quality naturally presented peptide data from mass spectrometry experiments have improved the accuracy
Our benchmarking of machine learning predictors for major histocompatibility complex (MHC)-binding and MHC-naturally presented antigen peptides contributes to establishing best practice in computational T-cell epitope analysis, which has implications for tumor neoantigen-based cancer vaccine discovery.

Summary

Introduction

The increasing wealth of immunogenomic information generated by next-generation sequencing (NGS) technologies is boosting the application of cancer immunotherapy that takes full advantage of an individual’s adaptive immunity by administering personalized cancer vaccines.[1,2,3] An essential step in provoking adaptive immunity, delivered by activated CD8+ or CD4+ T cells, is the recognition of T cell epitopes by the T cell receptor (TCR).[4] As the sequence repertoire of potential TCR-recognized epitopes, the whole exome or transcriptome of pathogens or tumor cells can be analyzed by bioinformatics pipelines to identify vaccine candidates.[5,6] Among the various processes related to antigen presentation, the binding of antigen peptides to MHC proteins is considered the major determinant. While these tools all serve the general purpose of MHC-binding prediction, the increasing variation in their methods, combined with emerging new types of experimental data, makes it necessary to rationally select the best approach, especially for potential applications in cancer vaccine design.

