Abstract

Antimicrobial peptides (AMPs) are peptide antibiotics with a broad spectrum of antimicrobial activities. Predicting the activities of AMPs from their amino acid sequences is of great therapeutic importance, but it poses challenges for prediction methods because the activity labels interact. In this paper we propose a novel multi-label learning model to address this problem. A weighted K-nearest neighbor classifier is adopted for efficient representation learning of the sequence data. A multiple linear regression model is then employed to learn a mapping from the classifier score vectors to the target labels, taking label correlations into account. Several popular multi-label learning algorithms and feature extraction methods were tested on a comprehensive, up-to-date AMP dataset covering twelve biological activities and on a filtered version covering five activities. The experimental results show that the proposed method performs competitively with previous work and could serve as a powerful engine for activity prediction of AMPs.

Highlights

  • Antimicrobial peptides (AMPs) are peptide antibiotics with a broad spectrum of antimicrobial activities

  • Classifier Chains (CC) is another way to transform multi-label learning into traditional single-label classification. Like Binary Relevance (BR), it builds one binary classifier per label, but the prediction of each subsequent classifier is affected by the output of the preceding ones; in this way, label correlation is taken into account in the chain

  • Many bioinformatics tools have been proposed for identifying whether a peptide sequence is an AMP, and some of them achieve a testing accuracy of more than 90% [6,8,12,13]
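The Classifier Chains idea in the highlights above can be illustrated with a minimal pure-Python sketch. It is not the implementation evaluated in the paper: the nearest-centroid base learner, the numeric feature vectors, and the names `NearestCentroid`, `fit_chain`, and `predict_chain` are all assumptions made for illustration. The point it shows is only the chaining step: classifier j is trained on the features plus labels 0..j-1, and at prediction time it receives the earlier classifiers' outputs.

```python
from typing import List, Sequence

Vector = List[float]


class NearestCentroid:
    """Tiny binary base learner (an assumption): picks the closer class centroid."""

    def fit(self, X: Sequence[Vector], y: Sequence[int]) -> "NearestCentroid":
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            if rows:  # a class may be absent from the training data
                self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x: Vector) -> int:
        def sq_dist(c: Vector) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, c))

        return min(self.centroids, key=lambda label: sq_dist(self.centroids[label]))


def fit_chain(X: Sequence[Vector], Y: Sequence[Sequence[int]]) -> List[NearestCentroid]:
    """Train one classifier per label; classifier j also sees true labels 0..j-1."""
    chain = []
    for j in range(len(Y[0])):
        X_aug = [list(x) + [float(v) for v in y[:j]] for x, y in zip(X, Y)]
        chain.append(NearestCentroid().fit(X_aug, [y[j] for y in Y]))
    return chain


def predict_chain(chain: List[NearestCentroid], x: Vector) -> List[int]:
    """Predict labels in order, feeding each prediction to the later classifiers."""
    predicted: List[int] = []
    for clf in chain:
        predicted.append(clf.predict_one(list(x) + [float(v) for v in predicted]))
    return predicted
```

Dropping the label columns from `X_aug` would turn this into Binary Relevance, where each label is predicted independently. The sequential loop in `predict_chain` is also what makes chains comparatively costly, since classifier j cannot run before classifiers 0..j-1 have produced their outputs.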


Summary

Introduction

Antimicrobial peptides (AMPs) are peptide antibiotics with a broad spectrum of antimicrobial activities. A multiple linear regression model is employed to learn a mapping from the classifier score vectors to the target labels, taking label correlations into account. Classifier Chains (CC) is another way to transform multi-label learning into traditional single-label classification. Like Binary Relevance (BR), it builds multiple binary classifiers, but the prediction of each subsequent classifier is affected by the output of the preceding ones; in this way, label correlation is taken into account in the chain. In the proposed method, label correlation is considered in a simple yet effective way: we neither need to create many new class labels, as the LP and RAkEL methods do, nor need to construct chains of classifiers, as CC and ECC do, which is very time-consuming. Experiments on the newly constructed AMP dataset demonstrate the superiority of the proposed method.
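The two-stage idea described above (weighted K-nearest neighbor scores per label, then a multiple linear regression from score vectors to labels) can be sketched roughly as follows. This is a hedged illustration, not the authors' implementation. The Euclidean distance on numeric feature vectors, the inverse-distance weights, the leave-one-out scoring of training samples, the small ridge term, the 0.5 decision threshold, and all function names are assumptions; this section does not state the paper's actual sequence similarity measure or weighting scheme.

```python
def knn_scores(x, X_train, Y_train, k=3, skip=None):
    """Stage 1: distance-weighted k-NN score per label (weights 1/(d + eps))."""
    dists = []
    for i, row in enumerate(X_train):
        if i == skip:  # leave-one-out when scoring a training sample
            continue
        d = sum((a - b) ** 2 for a, b in zip(x, row)) ** 0.5
        dists.append((d, i))
    nearest = sorted(dists)[:k]
    weights = [1.0 / (d + 1e-9) for d, _ in nearest]
    total = sum(weights)
    return [sum(w * Y_train[i][j] for w, (_, i) in zip(weights, nearest)) / total
            for j in range(len(Y_train[0]))]


def solve(A, B):
    """Solve A W = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(a) + list(b) for a, b in zip(A, B)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [v - f * pv for v, pv in zip(M[r], M[col])]
    return [row[n:] for row in M]


def fit_regression(S, Y, ridge=1e-6):
    """Stage 2: least-squares map from score vectors (plus bias) to all labels.

    Every output label is regressed on the scores of every label, which is
    how label correlations enter. The ridge term is an assumed safeguard
    against a singular normal-equation matrix.
    """
    Z = [list(s) + [1.0] for s in S]
    d, q = len(Z[0]), len(Y[0])
    A = [[sum(z[a] * z[b] for z in Z) + (ridge if a == b else 0.0)
          for b in range(d)] for a in range(d)]
    B = [[sum(z[a] * y[j] for z, y in zip(Z, Y)) for j in range(q)]
         for a in range(d)]
    return solve(A, B)


def fit(X_train, Y_train, k=3):
    """Score each training sample from its neighbors, then fit the regression."""
    S = [knn_scores(x, X_train, Y_train, k, skip=i) for i, x in enumerate(X_train)]
    return fit_regression(S, Y_train)


def predict(x, X_train, Y_train, W, k=3, threshold=0.5):
    """Map the k-NN score vector of x through W and threshold each label."""
    z = knn_scores(x, X_train, Y_train, k) + [1.0]
    outputs = [sum(z[a] * W[a][j] for a in range(len(z))) for j in range(len(W[0]))]
    return [1 if o >= threshold else 0 for o in outputs]
```

Unlike a classifier chain, this design fits a single weight matrix `W` in one linear solve and predicts all labels in one pass, without creating new combined class labels and without ordering the labels.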
