Abstract

Motivation: Posttranslational modification (PTM) is the enzymatic modification of proteins after their translation by ribosomes. Lysine residues can be subjected to a variety of PTMs, such as ubiquitination, acetylation, sumoylation, methylation, and succinylation. Residues carrying two or more modifications are more common than residues carrying a single PTM, and predicting which modifications occur at one residue can be framed as a multi-label problem. Characterizing uncharacterized protein sequences remains an important and active problem. In particular, computational methods that predict lysine acetylation and sumoylation together are highly desirable as a way of handling multi-label lysine sequences.

Results: In this paper, we propose an integrated approach, the five-step prediction method (FSPM), which addresses the problem by (1) using one-sided selection (OSS) to deal with imbalanced data, (2) extracting binary features from protein sequences, (3) applying binary relevance, classifier chains, and multi-class transformation methods to simplify the multi-label problem, (4) constructing different classifiers, and (5) evaluating these classifiers with cross-validation. In 10-fold cross-validation, FSPM achieved an accuracy of 61.49% and an absolute-true rate of 60.17%. These results suggest that FSPM can serve as a practical prediction engine for multi-label systems. We also conducted a variety of statistical analyses of the predicted results to discuss the biological functions of lysine acetylation and sumoylation.
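As a rough illustration of steps (2) and (5), the sketch below one-hot encodes a peptide window and computes two multi-label metrics. It is not the authors' implementation. The 21-symbol alphabet (with "-" as padding), the function names, and the label names are assumptions made for this example. The metric formulas follow definitions commonly used in multi-label evaluation (accuracy as the mean ratio of intersection to union of the true and predicted label sets, absolute-true as the exact-match rate), which the abstract itself does not spell out.

```python
# Assumed alphabet: 20 standard residues plus "-" as a padding symbol.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY-"


def binary_encode(window):
    """One-hot encode a peptide window into a flat list of 0/1 values."""
    features = []
    for residue in window:
        one_hot = [0] * len(AMINO_ACIDS)
        one_hot[AMINO_ACIDS.index(residue)] = 1
        features.extend(one_hot)
    return features


def multilabel_accuracy(true_sets, pred_sets):
    """Mean of |Y & Z| / |Y | Z| over samples.

    Assumed convention: a sample with no true and no predicted labels
    counts as fully correct.
    """
    total = 0.0
    for y, z in zip(true_sets, pred_sets):
        union = y | z
        total += len(y & z) / len(union) if union else 1.0
    return total / len(true_sets)


def absolute_true(true_sets, pred_sets):
    """Fraction of samples whose predicted label set matches exactly."""
    exact = sum(1 for y, z in zip(true_sets, pred_sets) if y == z)
    return exact / len(true_sets)


# Hypothetical labels for four lysine sites.
true_labels = [{"acetyl"}, {"acetyl", "sumo"}, {"sumo"}, set()]
pred_labels = [{"acetyl"}, {"acetyl"}, {"acetyl"}, set()]

encoded = binary_encode("AK-")
acc = multilabel_accuracy(true_labels, pred_labels)
abs_true = absolute_true(true_labels, pred_labels)
```

Absolute-true is the stricter of the two metrics, since a partially correct label set earns no credit, which is consistent with the reported absolute-true rate being lower than the accuracy.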
