Abstract

The definition of $k^{\text{th}}$-order empirical entropy of strings is extended to node-labelled binary trees: the proposed notion of $k^{\text{th}}$-order empirical entropy for such trees captures regularities in both the labels and the structure of a tree. A suitable binary encoding of tree straight-line programs (which have previously been used for grammar-based tree compression) is shown to yield binary tree encodings whose size is bounded by the $k^{\text{th}}$-order empirical entropy plus lower-order terms. This result is then extended from node-labelled binary trees to node-labelled unranked trees, generalizing recent results for grammar-based string compression to grammar-based tree compression. Additionally, experimental results on real XML document trees are presented: the proposed $k^{\text{th}}$-order empirical tree entropy is computed for these trees and compared with the output sizes achieved by grammar-based tree compressors.
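For readers unfamiliar with the string notion that is being extended, the following is a minimal sketch of $k^{\text{th}}$-order empirical entropy for strings. It assumes one common convention: the context of a symbol is the $k$ symbols preceding it, the first $k$ symbols are skipped, and the value is unnormalized (total bits rather than bits per symbol). The function name `empirical_entropy_k` is hypothetical, and the sketch does not reproduce the tree entropy defined in the paper.

```python
from collections import Counter, defaultdict
from math import log2


def empirical_entropy_k(text: str, k: int) -> float:
    """Unnormalized k-th order empirical entropy of a string, in bits.

    Assumed convention: the context of text[i] is text[i-k:i], and the
    first k symbols (which lack a full context) are not counted.
    """
    if k < 0:
        raise ValueError("k must be non-negative")

    # For every length-k context, count the symbols that follow it.
    followers = defaultdict(Counter)
    for i in range(k, len(text)):
        followers[text[i - k:i]][text[i]] += 1

    # Sum the zeroth-order entropy of each context's follower multiset.
    total_bits = 0.0
    for counts in followers.values():
        n_ctx = sum(counts.values())
        for c in counts.values():
            total_bits += c * log2(n_ctx / c)
    return total_bits
```

On the string `abababab`, this gives 8 bits for `k = 0` (two equally frequent symbols) and 0 bits for `k = 1`, since every symbol is determined by its predecessor. Higher orders reward exactly this kind of contextual regularity, which the tree notion aims to capture for labels and structure together.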
