Abstract

Trees have been topics of much interest since many decades due to various emerging applications using data represented as trees. Several techniques have been developed to compare two trees. But there is a serious lack of metrics to compare weighted trees. Existing approaches do not also allow to explicitly specify the targeted nodes properties on which the comparison should be performed. Furthermore, the problem of comparing two tree sets is not specifically addressed by existing techniques. This paper attempts to solve these problems by first proposing a distance and a similarity for the comparison of two finite sets of rooted ordered trees which can be labeled or not, as well as weighted or unweighted. To achieve this goal, a hidden Markov model is associated with each tree set for each targeted nodes property. The model associated with a tree set T for the targeted nodes property p learns how much the nodes of the trees in T verify property p. The resulting models are finally compared to derive a distance and similarity between the two sets of trees. The previous measures are then generalized for the comparison of unrooted and unordered trees. Flat classification experiments were carried out on two synthetic databases named FirstLast-L and FirstLast-LW available online. They both contain four classes of 100 rooted ordered trees whose specific and non-trivial nodes properties are clearly defined. When the distance proposed in this paper is selected as metric for the Nearest Neighbor classifier, a perfect accuracy of $$100\%$$ is obtained for these two databases. This performance is $$41\%$$ higher than the accuracy exhibited when the widespread tree Edit distance is selected for FirstLast-L.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.