A probabilistic model for mining labeled ordered trees: capturing patterns in carbohydrate sugar chains

N Ueda, T Akutsu, A Yamaguchi, K.F Aoki-Kinoshita, H Mamitsuka

doi:10.1109/tkde.2005.117

Abstract

Glycans, or carbohydrate sugar chains, which play a number of important roles in the development and functioning of multicellular organisms, can be regarded as labeled ordered trees. A recent increase in the documentation of glycan structures, especially in the form of database curation, has made mining glycans important for the understanding of living cells. We propose a probabilistic model for mining labeled ordered trees, and we further present an efficient learning algorithm for this model, based on an EM algorithm. The time and space complexities of this algorithm are rather favorable, falling within the practical limits set by a variety of existing probabilistic models, including stochastic context-free grammars. Experimental results have shown that, in a supervised problem setting, the proposed method outperformed five other competing methods by a statistically significant factor in all cases. We further applied the proposed method to aligning multiple glycan trees, and we detected biologically significant common subtrees in these alignments where the trees are automatically classified into subtypes already known in glycobiology.
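
To make the phrase "labeled ordered trees" concrete, the following is a minimal sketch of how a glycan might be held in memory. It is an illustration only and not the model or learning algorithm proposed in the paper: the `Node` class, the helper functions, and the choice of the N-glycan core pentasaccharide as the example are all assumptions introduced here. Each node carries a monosaccharide label, and the order of its children is kept, which is what separates an ordered tree from an unordered one.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """One monosaccharide in a glycan, viewed as a labeled ordered tree node."""
    label: str                                   # e.g. "Man", "GlcNAc"
    children: List["Node"] = field(default_factory=list)  # order is significant


def preorder(node: Node, depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (depth, label) for every node, parent first, children left to right."""
    yield depth, node.label
    for child in node.children:
        yield from preorder(child, depth + 1)


def parent_child_pairs(node: Node) -> Iterator[Tuple[str, str]]:
    """Yield (parent label, child label) for every edge in the tree."""
    for child in node.children:
        yield node.label, child.label
        yield from parent_child_pairs(child)


def adjacent_sibling_pairs(node: Node) -> Iterator[Tuple[str, str]]:
    """Yield (left label, right label) for neighbouring siblings.

    These pairs exist only because the children are ordered.
    """
    for left, right in zip(node.children, node.children[1:]):
        yield left.label, right.label
    for child in node.children:
        yield from adjacent_sibling_pairs(child)


# Toy example (assumption): the N-glycan core, rooted at the reducing-end GlcNAc.
core = Node("GlcNAc", [
    Node("GlcNAc", [
        Node("Man", [Node("Man"), Node("Man")]),
    ]),
])

if __name__ == "__main__":
    for depth, label in preorder(core):
        print("  " * depth + label)
```

Counts of parent-child and sibling label pairs such as these are the kind of local statistics a probabilistic tree model could be trained on, but which dependencies the proposed model actually uses is not stated in the abstract.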

Journal: IEEE Transactions on Knowledge and Data Engineering
Publication Date: Aug 1, 2005
Citations: 56

Similar Papers

Guest Editors' Introduction: Special Issue on Mining Biological Data
Wei Wang ... Jiong Yang
IEEE Transactions on Knowledge and Data Engineering | VOL. 17
01 Aug 2005

A new efficient probabilistic model for mining labeled ordered trees applied to glycobiology
Kosuke Hashimoto ... Hiroshi Mamitsuka
ACM Transactions on Knowledge Discovery from Data | VOL. 2
01 Mar 2008

ProfilePSTMM: capturing tree-structure motifs in carbohydrate sugar chains
K F Aoki-Kinoshita ... M Kanehisa
Bioinformatics | VOL. 22
15 Jul 2006

Mining significant tree patterns in carbohydrate sugar chains
K Hashimoto ... H Mamitsuka
Bioinformatics | VOL. 24
09 Aug 2008
