Abstract

Identification of protein post-translational modifications (PTMs) is an important step in disease-associated mutation studies. Although a protein undergoes many chemical alterations after translation, the addition of a succinyl group to a lysine residue plays a vital role in regulating cellular metabolism and thus in disease. A common approach is to apply a classification algorithm to features derived from protein structural, physicochemical, or biochemical information, which yields satisfactory results up to a point. Although researchers have already developed many computational methods to identify whether a lysine residue is modified with a succinyl group after translation, most of them focus on improving a single decision from a single method, on feature enrichment, or on developing a benchmark dataset. There is therefore still scope to improve the characterisation of lysine residues in a protein sequence by considering multiple predictors at once. In this study, an ensemble-based approach called DV-iSucLys is designed to characterise lysine residues by adapting three well-known and conceptually different classifiers and combining their decisions. In addition, a benchmark succinylation dataset was assembled from existing benchmark datasets and recently updated succinylation data from the UniProt consortium, both to investigate the performance of the proposed approach and to support further research. Rigorous cross-validation results show that DV-iSucLys characterises succinylated lysine residues better than existing predictors.

Highlights

  • A small number of genes (20,000–25,000) sustain human life, with a single gene encoding multiple proteins

  • To construct a robust benchmark dataset, experimentally validated protein sequences with lysine succinylation site details were collected from SwissProt/UniProt [17] and the Compendium of Protein Lysine Modifications database curated by the CUCKOO Workgroup [18]

  • A computationally simple lysine residue characterisation approach has been evaluated, with the primary motivation of combining conceptually different classifiers under a majority voting rule so that their individual weaknesses balance out
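
The majority voting rule mentioned in the last highlight can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_vote`, the three prediction lists, and the label encoding (1 for a succinylated lysine, 0 otherwise) are assumptions made for illustration only, and the source does not state which three classifiers are used.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-classifier label lists into one label per sample by majority rule."""
    if not predictions:
        raise ValueError("need predictions from at least one classifier")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must label the same samples")

    combined = []
    for votes in zip(*predictions):
        # Pick the label that most classifiers agree on for this sample.
        # With an odd number of voters and binary labels, no tie can occur.
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical outputs of three classifiers for five candidate lysine sites
# (1 = predicted succinylated, 0 = predicted not succinylated).
clf_a = [1, 0, 1, 0, 1]
clf_b = [1, 1, 0, 0, 1]
clf_c = [0, 1, 1, 0, 0]

final = majority_vote([clf_a, clf_b, clf_c])
print(final)  # [1, 1, 1, 0, 1]
```

Each classifier is wrong on a different site in this toy input, yet the combined decision is unaffected by any single disagreement, which is the sense in which voting can balance out individual weaknesses.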


Summary

Introduction

A small number of genes (20,000–25,000) sustain human life, with a single gene encoding multiple proteins. Among the different mechanisms of genetic code expansion, protein post-translational modification (PTM) is one of the most significant biological processes: it extends the functional diversity of the proteome through the covalent addition of functional groups or proteins, proteolytic cleavage of regulatory subunits, or degradation of entire proteins. Lysine succinylation is an evolutionarily conserved PTM that was first discovered at the active site of homoserine trans-succinylase [2] and occurs in both eukaryotes and prokaryotes. Identifying succinylation sites is considered one of the most challenging and crucial tasks for researchers, both because it helps explain the mechanism and function of protein succinylation, which is valuable for biomedical research and drug development, and because genome projects have made an enormous amount of protein sequence data available.

