Abstract

Background: Glycobiology pertains to the study of carbohydrate sugar chains, or glycans, in a particular cell or organism. Many computational approaches have been proposed for analyzing these complex glycan structures, which are chains of monosaccharides. The monosaccharides are linked to one another by glycosidic bonds, which can take on a variety of conformations, thus forming branches and resulting in complex tree structures. The q-gram method is one of the recent methods used to understand glycan function based on the classification of their tree structures. This method assumes that, for a given q, different q-grams share no similarity among themselves. That is, if two structures have completely different components, then they are treated as completely different. From a biological standpoint, however, this is not the case. In this paper, we propose a weighted q-gram method that measures the similarity among glycans by incorporating the similarity of the geometric structures, monosaccharides and glycosidic bonds among q-grams. In contrast to the traditional q-gram method, our weighted q-gram method admits similarity among q-grams for a given q. New kernels for glycan structures were thus developed and applied in SVMs to classify glycans.

Results: Two glycan datasets were used to compare the weighted q-gram method with the original q-gram method. The results show that incorporating q-gram similarity improves the classification performance for all of the important glycan classes tested.

Conclusion: The results in this paper indicate that similarity among q-grams, obtained from geometric structure, monosaccharides and glycosidic linkages, contributes to glycan function classification. This is a big step towards understanding glycan function based on complex glycan structures.

Highlights

  • Glycobiology pertains to the study of carbohydrate sugar chains, or glycans, in a particular cell or organism

  • Bioinformatics methods for glycobiology have recently developed rapidly due to the availability of glycan structure databases provided by major institutions including KEGG and the CFG (Consortium for Functional Glycomics)

  • Glycan structure data was retrieved from the KEGG/GLYCAN database [10] and their annotations are


Summary

Introduction

Glycobiology pertains to the study of carbohydrate sugar chains, or glycans, in a particular cell or organism. Glycans are chains of monosaccharides linked to one another by glycosidic bonds, which can take on a variety of conformations, forming branches and resulting in complex tree structures. The q-gram method is a recent method used to understand glycan function based on the classification of these tree structures. It assumes that, for a given q, different q-grams share no similarity among themselves. The structures at the leaves are understood to be important for various biological functions [1]. These often consist of many branches and can become rather complex. One of the important bioinformatics techniques applied to glycans is the support vector machine (SVM), used for the extraction of species-specific glycan substructures [2]. The application of tree kernels to glycan classification [4] was developed at the same time as the q-gram distribution kernel for disease-specific glycan motif extraction [5].
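To make the no-similarity assumption concrete, the following is a minimal sketch rather than the authors' implementation. It simplifies a q-gram to a downward path of q monosaccharide labels and ignores glycosidic linkage and geometric information. The toy trees, the function names and the `label_similarity` weighting are assumptions introduced only for illustration. The plain kernel counts identical q-grams only, while the weighted variant lets distinct q-grams contribute in proportion to their similarity.

```python
from collections import Counter

# A glycan is modelled as a rooted tree: (monosaccharide, [child subtrees]).
# The structures below are toy examples, not entries from KEGG GLYCAN.

def qgrams(tree, q):
    """Count downward paths of q monosaccharide labels (simplified q-grams)."""
    counts = Counter()

    def walk(node, path):
        label, children = node
        path = (path + (label,))[-q:]  # keep only the last q labels
        if len(path) == q:
            counts[path] += 1
        for child in children:
            walk(child, path)

    walk(tree, ())
    return counts

def plain_kernel(a, b):
    """Plain q-gram kernel: only identical q-grams contribute."""
    return sum(n * b[g] for g, n in a.items())

def label_similarity(g, h):
    """Hypothetical weighting: fraction of positions with the same label."""
    return sum(x == y for x, y in zip(g, h)) / len(g)

def weighted_kernel(a, b, sim=label_similarity):
    """Weighted q-gram kernel: every pair of q-grams contributes sim(g, h)."""
    return sum(n * m * sim(g, h) for g, n in a.items() for h, m in b.items())

# Two chains that share no 2-gram but have related components.
glycan_a = ("GlcNAc", [("Man", [("Gal", [])])])
glycan_b = ("GlcNAc", [("Gal", [("Gal", [])])])

qa, qb = qgrams(glycan_a, 2), qgrams(glycan_b, 2)
print(plain_kernel(qa, qb))     # identical q-grams only
print(weighted_kernel(qa, qb))  # partial matches also counted
```

Under these assumptions the plain kernel judges the two toy glycans to be entirely dissimilar, whereas the weighted kernel gives them a positive similarity. Kernel values of this kind could be assembled into a Gram matrix and supplied to an SVM that accepts precomputed kernels.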

