Abstract

Epitopes are antigenic determinants that are useful because they induce B-cell antibody production and stimulate T-cell activation. Bioinformatics can enable rapid, efficient prediction of potential epitopes. Here, we designed a novel B-cell linear epitope prediction system called LEPS (Linear Epitope Prediction by Propensities and Support Vector Machine), which combines physicochemical propensity identification with support vector machine (SVM) classification. We tested the LEPS on four datasets: AntiJen, HIV, a newly generated dataset (PC), and AHP, which combines the other three. Peptides with globally or locally high physicochemical propensities were first identified as primitive linear epitope (LE) candidates. These candidates were then classified with the SVM based on the unique features of amino acid segments. This step reduced the number of predicted epitopes and enhanced the positive predictive value (PPV). Compared to four other well-known LE prediction systems, the LEPS achieved the highest accuracy (72.52%), specificity (84.22%), PPV (32.07%), and Matthews' correlation coefficient (10.36%).

Highlights

  • Epitopes, also called antigenic determinants, are clusters of amino acid segments located on the surface of an antigen

  • We developed a novel B-cell linear epitope (LE) prediction system called LEPS (Linear Epitope Prediction by Propensities and Support Vector Machine)

  • To evaluate the performance of the LEPS at the level of the amino acid residue, five indicators were used to measure effectiveness at the default settings. These indicators were (1) sensitivity (SEN), defined as the percentage of epitopes that were correctly predicted as epitopes; (2) specificity (SPE), defined as the percentage of non-epitopes that were correctly predicted as non-epitopes; (3) positive predictive value (PPV), defined as the probability that a predicted epitope was in fact an epitope; (4) accuracy (ACC), defined as the proportion of correctly predicted peptides; and (5) Matthews' correlation coefficient (MCC), a measure of predictive performance that incorporates both SEN and SPE into a single value between −1 and +1 [26]
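The five indicators listed above can all be derived from the residue-level confusion matrix. The following is a minimal sketch, not the authors' evaluation code: the function names `confusion_counts` and `residue_metrics`, the 0/1 label encoding, and the choice to return 0.0 when a denominator is zero are assumptions made for illustration.

```python
import math


def confusion_counts(true_labels, predicted_labels):
    """Count TP, FP, TN, FN for per-residue labels (1 = epitope, 0 = non-epitope)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    # Assumption: report 0.0 instead of raising when a denominator is zero.
    return numerator / denominator if denominator else 0.0


def residue_metrics(tp, fp, tn, fn):
    """Return SEN, SPE, PPV, ACC, and MCC as fractions (not percentages)."""
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "SEN": _ratio(tp, tp + fn),            # epitope residues correctly predicted
        "SPE": _ratio(tn, tn + fp),            # non-epitope residues correctly predicted
        "PPV": _ratio(tp, tp + fp),            # predicted epitopes that are true epitopes
        "ACC": _ratio(tp + tn, tp + fp + tn + fn),
        "MCC": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


if __name__ == "__main__":
    # Made-up counts, used only to show the call; they are not results from the paper.
    for name, value in residue_metrics(tp=30, fp=20, tn=130, fn=20).items():
        print(f"{name}: {value:.4f}")
```

Because MCC uses all four cells of the confusion matrix, it remains informative when non-epitope residues greatly outnumber epitope residues, a situation in which accuracy alone can look deceptively high.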


Summary

Introduction

Epitopes, also called antigenic determinants, are clusters of amino acid segments located on the surface of an antigen. BepiPred combined a hydrophilicity scale with a hidden Markov model [20]; BCPred [21] and FBCPred [22] employed SVM with a subsequence kernel; Sollner and Mayer utilized a molecular operating environment with decision tree and nearest neighbour approaches [6]. These machine learning approaches were mostly set to predict peptides of fixed lengths. To overcome the drawbacks of training on and/or predicting fixed-length epitopes, ABCPred used two artificial neural network methods, the feed-forward network and the recurrent neural network, for the prediction of B-cell LEs [26]. Both networks were used with different window lengths from 10 to 20 amino acids and a two-residue interval.
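To make the window scheme described above concrete, the sketch below enumerates candidate peptide windows over a protein sequence. It is illustrative only and is not ABCPred code: the function name `sliding_windows` is hypothetical, and reading "a two-residue interval" as window lengths of 10, 12, ..., 20 is an assumption.

```python
def sliding_windows(sequence, min_len=10, max_len=20, step=2):
    """Yield (start, length, peptide) for every window of each allowed length.

    Assumption: window lengths increase in steps of `step` residues
    (10, 12, ..., 20 with the defaults), and windows slide one residue at a time.
    """
    for length in range(min_len, max_len + 1, step):
        # range() is empty when the sequence is shorter than the window.
        for start in range(len(sequence) - length + 1):
            yield start, length, sequence[start:start + length]


if __name__ == "__main__":
    # Arbitrary 12-residue string used only to demonstrate the call.
    for start, length, peptide in sliding_windows("ACDEFGHIKLMN"):
        print(start, length, peptide)
```

Each window would then be encoded and scored by a classifier; that step depends on the specific predictor and is not shown here.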

Materials and Methods
Epitope
A New Linear Epitope Dataset