Abstract
Rooted trees are ubiquitous data structures used to model hierarchical objects in many application domains. Various downstream analysis tasks require measures that quantify the (dis-)similarity between rooted trees. Many such measures exist, e.g., the widely used tree edit distance (TED). However, there are few algorithms for computing (dis-)similarity measures that are specifically designed for rooted, unordered, node-labeled trees and that support input trees of different orders. To close this gap in the literature, we introduce the edge-preservation similarity (EPS). We show how to compute EPS exactly via integer quadratic programming on small instances and present a scalable 4-approximation algorithm. An evaluation on tree representations of pseudoknotted RNA secondary structures and acyclic molecular graphs shows that both exact and approximate (normalized) EPS preserve functional similarities between the compared RNAs and molecules better than the often-used TED. Python implementations of our algorithms and scripts to reproduce the results are available on GitHub: https://github.com/bionetslab/edge-preservation-similarity.