Abstract

The tree edit distance is one of the most widely used measures for comparing tree-structured data and has been applied to the analysis of RNA secondary structures, glycan structures, and vascular trees. However, the tree edit distance problem is NP-hard for unordered trees, whereas it is solvable in polynomial time for ordered trees. We recently proposed a clique-based method for computing the tree edit distance between unordered trees, in which each instance of the tree edit distance problem is transformed into an instance of the maximum vertex weighted clique problem and an existing clique algorithm is then applied. In this paper, we propose an improved clique-based method. Unlike our previous method, the improved method is essentially a dynamic programming algorithm that repeatedly solves instances of the maximum vertex weighted clique problem as sub-problems. We also introduce further heuristic techniques that do not violate the optimality of the solution. When applied to the comparison of large glycan structures, the improved method showed a significant speed-up in most cases.
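
To make the sub-problem concrete, the following is a minimal sketch of an exact solver for the maximum vertex weighted clique problem, written as a simple branch-and-bound search. It is not the clique algorithm used in the paper, and the abstract does not specify how a tree edit distance instance is encoded as a graph, so the graph below is a toy example. The function name `max_weight_clique` and the inputs `adj` and `weight` are hypothetical, and strictly positive vertex weights are assumed.

```python
def max_weight_clique(adj, weight):
    """Return (best_weight, best_clique) for an undirected graph.

    adj    : dict mapping each vertex to the set of its neighbours
    weight : dict mapping each vertex to a positive weight (assumption)
    """
    best = (0, frozenset())

    def expand(clique, total, candidates):
        nonlocal best
        # Every set reached here is a clique; keep the heaviest one seen.
        if total > best[0]:
            best = (total, frozenset(clique))
        # Bound: even taking all candidates cannot beat the incumbent.
        if total + sum(weight[v] for v in candidates) <= best[0]:
            return
        remaining = set(candidates)
        # Try heavier vertices first so that good incumbents appear early.
        for v in sorted(candidates, key=lambda u: weight[u], reverse=True):
            remaining.discard(v)
            # Only vertices adjacent to v can extend a clique containing v.
            expand(clique | {v}, total + weight[v], remaining & adj[v])

    expand(frozenset(), 0, set(adj))
    return best


# Toy graph: a triangle a-b-c plus a heavy vertex d attached to c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
weight = {"a": 1, "b": 1, "c": 1, "d": 5}

best_weight, best_clique = max_weight_clique(adj, weight)
print(best_weight, sorted(best_clique))
```

In this toy graph the heaviest clique is the edge between `c` and `d` rather than the larger triangle, which illustrates why vertex weights, not clique size, drive the search. The worst-case running time is exponential, as expected for an NP-hard problem.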
