Abstract

Tree-structured data are common in many domains, making structural classification an important task. In this paper, we introduce an associative classification method based on a structure-preserving flat representation of trees. A major difference from traditional tree mining techniques is that subtrees are constrained by their position in the original trees, which drastically reduces the number of rules generated, especially for data with great structural variation among tree instances. This characteristic is desirable given the current state of frequent pattern mining, where an excessive number of patterns hinders the practical use of results. However, the question remains whether this reduction comes at a high cost in accuracy and coverage rate. We explore this aspect and compare the approach with a state-of-the-art structural classifier based on the same subtree type but not positionally constrained in any way. We investigate how the type of frequent pattern (frequent or closed) and the subtree type (induced, embedded, or embedded-plus-disconnected) affect the performance of the two classifiers. Different rule strength measures, such as confidence, weighted confidence, and likelihood, are also examined. Experiments on three real-world data sets reveal important similarities and differences between the methods.
