Abstract

Background: Lysine acetylation is a widespread, reversible post-translational modification that plays a crucial role in some biological activities. To better understand its mechanism, acetylation sites in proteins must be identified accurately. Computational methods are popular because they are faster and more convenient than experimental methods. In this study, we propose a new computational method for predicting acetylation sites in human proteins. It combines sequence features and structural features, including physicochemical property (PCP), position-specific scoring matrix (PSSM), auto covariation (AC), residue composition (RC), secondary structure (SS) and accessible surface area (ASA), which together characterize acetylated lysine sites well. A two-step feature selection combining minimum redundancy maximum relevance (mRMR) and incremental feature selection (IFS) was then applied. Finally, a cascade classifier based on support vector machines (SVM) was trained, which addressed the imbalance between positive and negative samples while making use of the information in all negative samples.

Results: On an independent dataset, the method achieved a specificity of 72.19% and a sensitivity of 76.71%, which shows that the cascade SVM classifier outperforms a single SVM classifier.

Conclusions: In addition to analyzing the experimental results, we carried out a systematic and comprehensive analysis of the acetylation data.

Highlights

  • Lysine acetylation is a widespread, reversible post-translational modification that plays a crucial role in some biological activities

  • Combined features achieve higher Sn, Sp, Acc and Matthews correlation coefficient (MCC) than sequence features alone, which indicates that structural features are significant and useful for prediction

  • In this study, we apply a cascade classifier to the problem of predicting human protein acetylation sites, combining sequence features and structural features
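
The four measures named in the highlights follow the standard confusion-matrix definitions, where Sn is sensitivity, Sp is specificity and Acc is accuracy. The sketch below shows how they are usually computed. The function name `binary_metrics` and the counts in the usage example are illustrative assumptions, not values or code from the paper.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return Sn, Sp, Acc and MCC from confusion-matrix counts.

    tp/fn: acetylated sites predicted correctly / missed.
    tn/fp: non-acetylated sites predicted correctly / wrongly flagged.
    """
    sn = tp / (tp + fn) if (tp + fn) else 0.0   # sensitivity (recall on positives)
    sp = tn / (tn + fp) if (tn + fp) else 0.0   # specificity (recall on negatives)
    total = tp + fn + tn + fp
    acc = (tp + tn) / total if total else 0.0   # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


# Hypothetical counts, chosen only to illustrate the call.
scores = binary_metrics(tp=80, fn=20, tn=70, fp=30)
print(scores)
```

MCC is often preferred on imbalanced data because it uses all four counts, whereas accuracy can look high when one class dominates.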


Summary

Introduction

Lysine acetylation is a widespread, reversible post-translational modification that plays a crucial role in some biological activities. We propose a new computational method for predicting acetylation sites in human proteins. It combines sequence features and structural features, including physicochemical property (PCP), position-specific scoring matrix (PSSM), auto covariation (AC), residue composition (RC), secondary structure (SS) and accessible surface area (ASA), which together characterize acetylated lysine sites well. A two-step feature selection combining mRMR and IFS was applied. A cascade classifier based on SVM was then trained, which addressed the imbalance between positive and negative samples while making use of the information in all negative samples. As a widespread type of protein post-translational modification (PTM), acetylation on lysine plays a significant role in various organisms. At least a dozen computational programs were developed in earlier studies for the prediction of lysine acetylation sites, such as AceK, ASEB, BPBPHKA, EnsemblePail, iPTMmLys, KAcePred, KA-predictor, LAceP, LysAcet, NAce, PLMLA, PSKAcePred and SSPKA [5,6,7,8,9,10,11,12,13,14,15,16,17].
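
This summary does not spell out how the cascade is built, so the following is only a minimal sketch of one common way a cascade can balance classes while still using every negative sample. It assumes the negatives are split into disjoint chunks about the size of the positive set, one balanced classifier is trained per chunk, and a site is called positive only if every stage accepts it. The names `train_cascade`, `predict_cascade` and `CentroidClassifier` are hypothetical, and the base learner here is a nearest-centroid stand-in rather than an SVM, used only to keep the example dependency-free.

```python
import random


class CentroidClassifier:
    """Stand-in base learner (NOT an SVM): predicts the label of the nearest class mean."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, lab in zip(X, y) if lab == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        def dist2(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids, key=lambda lab: dist2(x, self.centroids[lab]))
            for x in X
        ]


def train_cascade(positives, negatives, make_classifier=CentroidClassifier, seed=0):
    """Train one stage per negative chunk; every negative is used exactly once.

    Assumption: chunk size equals the number of positives, so each stage is
    roughly balanced (the last chunk may be smaller).
    """
    shuffled = negatives[:]
    random.Random(seed).shuffle(shuffled)
    chunk = max(1, len(positives))
    stages = []
    for start in range(0, len(shuffled), chunk):
        neg_chunk = shuffled[start:start + chunk]
        X = positives + neg_chunk
        y = [1] * len(positives) + [0] * len(neg_chunk)
        stages.append(make_classifier().fit(X, y))
    return stages


def predict_cascade(stages, X):
    """Label a sample positive (1) only if every stage predicts positive."""
    return [
        1 if all(stage.predict([x])[0] == 1 for stage in stages) else 0
        for x in X
    ]


# Toy 2-D feature vectors, invented for illustration only.
pos = [[5.0, 5.0], [5.5, 4.5], [4.5, 5.5]]
neg = [[0.0, 0.0], [0.5, -0.5], [-0.5, 0.5], [1.0, 0.0],
       [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
stages = train_cascade(pos, neg)
print(len(stages), predict_cascade(stages, [[5.0, 5.0], [0.0, 0.0]]))
```

Any object with `fit(X, y)` returning itself and `predict(X)` can be passed as `make_classifier`, so an SVM implementation could replace the stand-in. Requiring all stages to agree trades some sensitivity for specificity, which is a design choice of this sketch rather than something stated in the source.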
