Abstract
Epitopes are immunogenic regions of antigen proteins. Prediction of B-cell epitopes is critical for immunological applications. B-cell epitopes are categorized as linear or conformational, and the majority are conformational. Several machine learning methods have been proposed to identify conformational B-cell epitopes, but their predictive performance is not ideal. One open question is whether the prediction of conformational B-cell epitopes can be improved by using ensemble methods. In this paper, we propose an ensemble method that combines 12 support vector machine-based predictors to predict conformational B-cell epitopes using an unbound dataset. AdaBoost and resampling methods are used to deal with the imbalanced labeled dataset. The proposed method achieves an AUC of 0.642–0.672 on the training dataset with 5-fold cross-validation and an AUC of 0.579–0.604 on the test dataset. We also report several observations on the bound and unbound datasets. In both datasets, epitopes are more accessible than non-epitopes and occur preferentially in beta-turns. The flexibility and polarity of epitopes are higher than those of non-epitopes. In the bound dataset, Asn (N), Glu (E), Gly (G), Lys (K), Ser (S), and Thr (T) are preferred in epitope regions, while Ala (A), Leu (L), and Val (V) are preferred in non-epitope regions. In the unbound dataset, Glu (E) and Lys (K) are preferred in epitope sites, while Leu (L) and Val (V) are preferred in non-epitope sites.
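As an illustration of the evaluation setting described above, the following minimal Python sketch shows one way to balance an imbalanced label set, combine scores from several base predictors, and compute AUC. It is not the method of this paper: random undersampling stands in for the unspecified resampling scheme, simple score averaging stands in for the AdaBoost-based combination of SVM predictors, and the function names `undersample`, `average_scores`, and `auc` are hypothetical.

```python
import random


def undersample(labels, rng):
    """Return sorted indices of a class-balanced subset (assumed scheme:
    random undersampling of the larger class)."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    k = min(len(pos), len(neg))
    return sorted(rng.sample(pos, k) + rng.sample(neg, k))


def average_scores(score_lists):
    """Combine per-residue scores from several base predictors by averaging
    (a stand-in for a weighted ensemble, not AdaBoost itself)."""
    return [sum(col) / len(col) for col in zip(*score_lists)]


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive scores higher; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data: 1 = epitope residue, 0 = non-epitope residue (made up).
    labels = [1, 0, 0, 0, 1, 0, 0, 0]
    rng = random.Random(0)
    print("balanced indices:", undersample(labels, rng))

    # Scores from two hypothetical base predictors for the same residues.
    predictor_a = [0.9, 0.2, 0.4, 0.1, 0.6, 0.3, 0.5, 0.2]
    predictor_b = [0.7, 0.3, 0.2, 0.2, 0.8, 0.1, 0.4, 0.3]
    combined = average_scores([predictor_a, predictor_b])
    print("ensemble AUC:", auc(labels, combined))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of epitope from non-epitope residues, which gives a scale for reading the values reported above.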