Abstract

Trees form a broad family of combinatorial objects with a wide range of applications, from plant modeling to the analysis of XML files and the study of lineage trees. One can also mention classification trees, which arise in statistical learning, as well as game trees. In all these examples, the underlying structure is a rooted tree, which may be ordered (if the order of siblings is significant) or unordered. For modeling purposes, the vertices of a tree can be augmented with labels describing their properties: for example, in a cell lineage tree, each vertex represents a cell and the labels might describe its volume, its tissue type, etc. Despite the diversity of applications, one may want a single data structure to encode rooted trees and perform operations on them. Treex is a Python package for manipulating rooted trees, ordered or not, with or without labels on their vertices. Treex supports (i) random generation of trees, (ii) edit operations (e.g., adding or removing vertices or labels), (iii) visualization of structures and their properties (on the command line or in a Matplotlib figure, with export to TeX), (iv) conversion to different formats, and (v) application of various algorithms. Concerning the latter two, coding processes (Pitman, 2006) have been implemented, as well as DAG compression (Godin & Ferraro, 2010). In addition, trees can be compared via an edit distance algorithm (Azais, Durand, & Godin, 2019). Self-nested approximations of trees (Godin & Ferraro, 2010; Azais, 2017; Azais et al., 2019) have been implemented through different algorithms. To the best of our knowledge, treex is the only Python library that permits encoding of rooted trees in various formats together with such a diversity of treatments. Let us mention the related Java library TED (Pawlik & Augsten, 2016) for efficient edit distance algorithms.
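To give an idea of what DAG compression of an unordered rooted tree involves, here is a minimal sketch in plain Python. It does not use the treex API: the nested-list tree representation and the function name `dag_compress` are assumptions made for illustration only. Isomorphic subtrees are merged into a single class, and each class records how many children of each other class it has.

```python
from collections import Counter


def dag_compress(tree):
    """Merge isomorphic subtrees of an unordered rooted tree.

    `tree` is a nested list: a vertex is the list of its children
    (hypothetical representation, not the one used by treex).
    Returns (root_id, nodes), where nodes[i] maps each child class id
    to its multiplicity among the children of class i.
    """
    ids = {}    # canonical signature -> class id
    nodes = []  # class id -> Counter of child class ids

    def visit(children):
        # Sorting the child class ids makes the signature independent
        # of sibling order, as required for unordered trees.
        child_ids = sorted(visit(child) for child in children)
        signature = tuple(child_ids)
        if signature not in ids:
            ids[signature] = len(nodes)
            nodes.append(Counter(child_ids))
        return ids[signature]

    return visit(tree), nodes


# A root with three children: two identical subtrees (two leaves each)
# and one leaf. The tree has 8 vertices but only 3 distinct subtrees.
example = [[[], []], [[], []], []]
root, nodes = dag_compress(example)
print(root, len(nodes), dict(nodes[root]))
```

On this example the 8 vertices collapse to 3 classes (leaf, vertex with two leaves, root). The recursion is kept simple for readability and would need an iterative traversal for very deep trees.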
Treex offers converters to the standard nested-bracket encoding (see, for instance, Aho, Hopcroft, & Ullman, 1974) and to L-strings as manipulated by L-Py, a simulation framework for modeling plant architectures (Boudon, Pradal, Cokelaer, Prusinkiewicz, & Godin, 2012). Numerical experiments and/or figures in recent publications (Azais, Genadot, & Henry, 2019; Azais, 2017; Azais et al., 2019) were produced using the current or previous versions of treex. Furthermore, ongoing academic projects on the development and implementation of supervised classification methods for tree data, on the study of lineage trees, and on plant modeling make intensive use of the structures and algorithms implemented in treex. Treex is open source and distributed under the LGPL License. The source code is hosted on GitLab. Releases are automatically built and tested for 64-bit Linux and Mac OS X machines using Jenkins CI, and packaged on Anaconda Cloud (Azais, Cerutti, Gemmerle, & Ingels, 2019a). The documentation of all classes, methods and functions is built upon release and made available online (Azais, Cerutti, Gemmerle, & Ingels, 2019b).
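As an illustration of the nested-bracket encoding mentioned above, the following sketch converts an unlabeled ordered tree to a bracket string and back. It is written in plain Python and is not the treex converter: the nested-list representation and the names `to_brackets` and `from_brackets` are hypothetical, and the exact string format produced by treex may differ.

```python
def to_brackets(tree):
    """Encode an ordered tree (a vertex is the list of its children)
    as a nested-bracket string: each vertex opens '[' and closes ']'."""
    return "[" + "".join(to_brackets(child) for child in tree) + "]"


def from_brackets(text):
    """Decode a nested-bracket string back into nested lists."""
    stack = [[]]  # sentinel holding the root once it is parsed
    for ch in text:
        if ch == "[":
            vertex = []
            stack[-1].append(vertex)  # attach to the current parent
            stack.append(vertex)      # descend into the new vertex
        elif ch == "]":
            if len(stack) == 1:
                raise ValueError("unbalanced closing bracket")
            stack.pop()               # go back up to the parent
        else:
            raise ValueError(f"unexpected character: {ch!r}")
    if len(stack) != 1 or len(stack[0]) != 1:
        raise ValueError("expected exactly one balanced rooted tree")
    return stack[0][0]


# A root with two children; the first child has two leaves.
tree = [[[], []], []]
encoded = to_brackets(tree)
decoded = from_brackets(encoded)
print(encoded)
```

Because sibling order is preserved by the string, this encoding is naturally suited to ordered trees; for unordered trees, two different strings may represent the same tree.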


