Abstract

BackgroundComputational methods for the prediction of Major Histocompatibility Complex (MHC) class II binding peptides play an important role in facilitating the understanding of immune recognition and the process of epitope discovery. To develop an effective computational method, we need to consider two important characteristics of the problem: (1) the length of binding peptides is highly flexible; and (2) MHC molecules are extremely polymorphic and for the vast majority of them there are no sufficient training data.MethodsWe develop a novel string kernel MHC2SK (MHC-II String Kernel) method to measure the similarities among peptides with variable lengths. By considering the distinct features of MHC-II peptide binding prediction problem, MHC2SK differs significantly from the recently developed kernel based method, GS (Generic String) kernel, in the way of computing similarities. Furthermore, we extend MHC2SK to MHC2SKpan for pan-specific MHC-II peptide binding prediction by leveraging the binding data of various MHC molecules.ResultsMHC2SK outperformed GS in allele specific prediction using a benchmark dataset, which demonstrates the effectiveness of MHC2SK. Furthermore, we evaluated the performance of MHC2SKpan using various benckmark data sets from several different perspectives: Leave-one-allele-out (LOO), 5-fold cross validation as well as independent data testing. MHC2SKpan has achieved comparable performance with NetMHCIIpan-2.0 and outperformed NetMHCIIpan-1.0, TEPITOPEpan and MultiRTA, being statistically significant. MHC2SKpan can be freely accessed at http://datamining-iip.fudan.edu.cn/service/MHC2SKpan/index.html.

Highlights

  • Computational methods for the prediction of Major Histocompatibility Complex (MHC) class II binding peptides play an important role in facilitating the understanding of immune recognition and the process of epitope discovery

  • MHC2SK outperformed general string (GS) in the allele-specific prediction task on a benchmark dataset, which demonstrates the effectiveness of MHC2SK

  • Experimental procedure and evaluation metrics The prediction model was learned by the support vector regression (SVR) algorithm

Read more

Summary

Introduction

Computational methods for the prediction of Major Histocompatibility Complex (MHC) class II binding peptides play an important role in facilitating the understanding of immune recognition and the process of epitope discovery. To develop an effective computational method, we need to consider two important characteristics of the problem: (1) the length of binding peptides is highly flexible; and (2) MHC molecules are extremely polymorphic and for the vast majority of them there are no sufficient training data. This results in a large length variation of of binding peptides (usually 11-20 amino acids) [19] Several computational methods, such as TEPITOPE [9], SMM-align [4] and NN-align [5], try to locate the binding core of a peptide in the modeling process, which is a nonamer sitting in the binding groove of MHC molecules. The basic idea of pan-specific methods is to identify the relationship among MHC alleles so that the binding preferences of target MHC molecules can be captured

Methods
Results
Discussion
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.