Abstract

Polystyrene binding peptides (PSBPs) play a key role in the immobilization process. Correctly identifying PSBPs is the first step in all related work. In this paper, we propose a novel support vector machine-based bioinformatic identification model. The model comprises four machine learning steps: feature extraction, feature selection, model training, and optimization. In a five-fold cross-validation test, it achieves a sensitivity (SN) of 90.38%, a specificity (SP) of 84.62%, an accuracy (ACC) of 87.50%, and an area under the ROC curve (AUC) of 0.90. It outperforms the state-of-the-art identifier in SN and ACC while using a smaller feature set. Furthermore, we constructed a web server that hosts the proposed model, freely accessible at http://server.malab.cn/PSBP-SVM/index.jsp.
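
For readers unfamiliar with the SN, SP, and ACC figures quoted in the abstract, the following minimal sketch shows how these measures are conventionally computed from confusion-matrix counts. The function name and the counts are hypothetical: the counts were chosen only so that the arithmetic reproduces the quoted percentages, and they are not taken from the paper, which does not give its confusion matrix in this text.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (SN, SP, ACC) as fractions from confusion-matrix counts.

    tp/fn: positives (binders) predicted correctly / missed.
    tn/fp: negatives (non-binders) predicted correctly / wrongly flagged.
    """
    sn = tp / (tp + fn)                    # sensitivity (recall on positives)
    sp = tn / (tn + fp)                    # specificity (recall on negatives)
    acc = (tp + tn) / (tp + fn + tn + fp)  # overall accuracy
    return sn, sp, acc


# Hypothetical counts for illustration only (not the paper's data).
sn, sp, acc = classification_metrics(tp=47, fn=5, tn=44, fp=8)
print(f"SN={sn:.2%}  SP={sp:.2%}  ACC={acc:.2%}")
```

AUC is not derived from a single confusion matrix; it summarizes the ranking quality of the classifier's scores across all decision thresholds, which is why it is reported as a value between 0 and 1 rather than as a percentage.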

Highlights

  • The immobilization of a biological functional molecule on a solid surface is one of the most important topics in the field of biology

  • The frequency feature can be calculated from the single amino acid composition (AAC), the dipeptide composition (DPC), the composition of tripeptides or longer peptides, or the composition of residue pairs separated by a fixed gap

  • In the feature extraction step, a 420-dimensional feature vector (20 AAC values plus 400 DPC values) is generated from the benchmark dataset
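
As an illustration of the AAC and DPC features described in the highlights, the sketch below computes a 420-dimensional vector (20 single-residue frequencies followed by 400 dipeptide frequencies) for one peptide. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the alphabetical residue ordering, the handling of non-standard residues, and the example peptide are all hypothetical.

```python
from itertools import product

# The 20 standard amino acids; the ordering is an assumption of this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in the same ordering.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def aac(seq):
    """Amino acid composition: frequency of each of the 20 residues."""
    n = len(seq)
    if n == 0:
        return [0.0] * len(AMINO_ACIDS)
    # Non-standard residues are not counted but still contribute to n.
    return [seq.count(a) / n for a in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide composition: frequency of each of the 400 adjacent pairs."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]  # overlapping pairs
    total = len(pairs)
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [pairs.count(d) / total for d in DIPEPTIDES]


def aac_dpc_features(seq):
    """Concatenate AAC (20) and DPC (400) into one 420-dimensional vector."""
    seq = seq.upper()
    return aac(seq) + dpc(seq)


# Hypothetical 12-residue peptide, used only to show the output shape.
features = aac_dpc_features("FKFWLYEHVIRG")
print(len(features))
```

Because short peptides contain few dipeptides, most of the 400 DPC entries are zero for any single sequence, which is one reason a feature selection step is useful before training a classifier.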

Summary

INTRODUCTION

The immobilization of biologically functional molecules on solid surfaces is one of the most important topics in the field of biology. Machine learning algorithms have been widely used in biological sequence recognition (Wang et al., 2008, 2010, 2018; Zhou et al., 2017, 2018, 2019; He et al., 2018; Liao et al., 2018; Xu et al., 2018a,b; Bao et al., 2019; Cheng et al., 2019; Ding et al., 2019; Fang et al., 2019; Jin et al., 2019; Liu et al., 2019a; Meng et al., 2019; Shen et al., 2019; Zhu et al., 2019). This process generally includes data collection, feature extraction, feature selection, and model training. In the “Conclusion” section, we analyze the shortcomings of the model and discuss directions for its future improvement.

MATERIALS AND METHODS
Evaluation Measurement
RESULTS AND DISCUSSION
CONCLUSION
DATA AVAILABILITY STATEMENT