Abstract

Background: O-linked glycosylation, which occurs enzymatically in biological systems, affects protein folding, localization and trafficking, solubility, antigenicity, and biological activity, as well as cell-cell interactions of membrane proteins. The enzymes involved include glycosyltransferases, which transfer sugars, and glycosidases, which trim specific monosaccharides from precursors to form intermediate structures. Because experimental identification is difficult, several studies have used computational methods to identify glycosylation sites.

Results: By investigating glycosylated sites that contain various motifs in transmembrane (TM) and non-transmembrane (non-TM) proteins, this work presents a novel method, GlycoRBF, that implements radial basis function (RBF) networks with significant amino acid pairs (SAAPs) to identify O-linked glycosylated serine and threonine residues on TM and non-TM proteins. Membrane topology is also considered in order to reduce false positives on glycosylated TM proteins. In an evaluation using five-fold cross-validation, considering membrane topology removes 31.4% of the false positives when identifying O-linked glycosylation sites on TM proteins. In an independent test, GlycoRBF outperforms previous O-linked glycosylation site prediction schemes.

Conclusion: A case study of Cyclic AMP-dependent transcription factor ATF-6 alpha is presented to demonstrate the effectiveness of GlycoRBF. The web-based GlycoRBF, accessible at http://GlycoRBF.bioinfo.tw, identifies O-linked glycosylated serine and threonine residues effectively and efficiently. The structural topology of TM proteins with glycosylation sites is also provided to users. A stand-alone version of GlycoRBF is available for high-throughput data analysis.

Highlights

  • O-linked glycosylation, which occurs enzymatically in biological systems, affects protein folding, localization and trafficking, solubility, antigenicity, and biological activity, as well as cell-cell interactions of membrane proteins

  • Membrane topology is considered in order to reduce false positives on glycosylated TM proteins

  • When the structural topology of glycosylated TM proteins is considered, the prediction accuracy can reach 85.4% and false positives are reduced by 31.4%
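
The topology step in the highlights above can be illustrated with a minimal sketch. The text does not say how topology is applied, so the rule here is an assumption: predicted serine/threonine sites are kept only when they fall in an allowed topological region. The per-residue label string (`o` for outside, `M` for membrane-spanning, `i` for inside), the function names, and the default of keeping only `o` residues are all hypothetical and are not taken from GlycoRBF.

```python
from typing import FrozenSet, List


def candidate_sites(sequence: str) -> List[int]:
    """Return 0-based positions of serine (S) and threonine (T) residues."""
    return [i for i, aa in enumerate(sequence) if aa in "ST"]


def filter_by_topology(
    predicted: List[int],
    topology: str,
    allowed: FrozenSet[str] = frozenset({"o"}),
) -> List[int]:
    """Keep predicted sites whose topology label is in `allowed`.

    `topology` is a hypothetical per-residue label string with one
    character per residue: 'o' outside, 'M' membrane, 'i' inside.
    Which regions to keep is an assumption of this sketch.
    """
    for p in predicted:
        if not 0 <= p < len(topology):
            raise ValueError(f"site {p} is outside the topology string")
    return [p for p in predicted if topology[p] in allowed]


if __name__ == "__main__":
    seq = "MSTAALLLLLLSGT"   # toy sequence, not a real protein
    topo = "ooooMMMMMMMiii"  # toy topology labels, same length as seq
    sites = candidate_sites(seq)
    print(sites)
    print(filter_by_topology(sites, topo))
```

In this toy case the S/T residues at positions 11 and 13 are labeled `i` and are dropped, which is how a topology rule of this kind would remove predictions a sequence-only classifier might otherwise report.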


Summary

Introduction

O-linked glycosylation, which occurs enzymatically in biological systems, affects protein folding, localization and trafficking, solubility, antigenicity, and biological activity, as well as cell-cell interactions of membrane proteins. Because experimental identification is difficult, several studies have used computational methods to identify glycosylation sites. Protein glycosylation adds an oligosaccharide (a chain of sugars) to a polypeptide (a chain of amino acids) to produce a glycoprotein. Gupta et al. [7] proposed a web-based tool, NetNGlyc, for identifying N-glycosylation sites in human proteins. Li et al. [8] applied a support vector machine (SVM) to predict O-glycosylation sites in mammalian proteins. Chen et al. [10] analyzed the k-spaced amino acid pairs of glycoproteins and developed an SVM-based method to predict O-linked glycosylation sites. Hamby and Hirst [11] adopted the random forests method, integrating the frequencies of amino acids surrounding the modified residue with significant pairwise patterns, to predict glycosylation sites.
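
To make the idea of k-spaced amino acid pairs concrete, the following sketch counts residue pairs separated by exactly k intervening residues inside a window centered on a candidate serine or threonine. It is an illustration only: the window size, the `-` padding character, and the function names are assumptions, and it reproduces neither the exact encoding of Chen et al. [10] nor the way GlycoRBF selects its significant amino acid pairs.

```python
from collections import Counter
from typing import Dict

PAD = "-"  # hypothetical padding symbol for windows that run past the termini


def extract_window(sequence: str, center: int, flank: int) -> str:
    """Return the 2*flank+1 residue window centered on `center` (0-based)."""
    padded = PAD * flank + sequence + PAD * flank
    # Position `center` in `sequence` sits at `center + flank` in `padded`,
    # so the window starts at `center` in the padded string.
    return padded[center:center + 2 * flank + 1]


def k_spaced_pairs(window: str, k: int) -> Dict[str, int]:
    """Count pairs (window[i], window[i+k+1]), skipping padded positions."""
    counts: Counter = Counter()
    for i in range(len(window) - k - 1):
        pair = window[i] + window[i + k + 1]
        if PAD not in pair:
            counts[pair] += 1
    return dict(counts)


if __name__ == "__main__":
    seq = "APTSTPG"                    # toy sequence, not a real protein
    window = extract_window(seq, 3, 2)  # window around the serine at index 3
    print(window)
    for k in (0, 1):
        print(k, k_spaced_pairs(window, k))
```

For the toy window `PTSTP`, k = 0 yields the adjacent pairs PT, TS, ST and TP, while k = 1 yields PS, TT and SP. Counts of this kind, collected over several values of k, form the sort of feature vector a classifier such as an SVM or an RBF network could be trained on.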
