There are a number of measures of the degree of similarity between rooted binary trees. Many of these ignore sections of the trees that are in complete agreement. We use computational experiments to investigate the statistical characteristics of one such measure of tree similarity for ordered, rooted, binary trees. We generate the trees used in the experiments iteratively, using the Yule process modeled upon speciation.

Rooted binary trees arise in a wide range of settings, from biological evolutionary trees to efficient structures for searching datasets. There are a number of measures of tree similarity that arise in these settings. Here we investigate a measure that is relevant for ordered, rooted, binary trees of the same size. Examples of trees satisfying these conditions include some binary search trees. Our approach is to consider pairs of such trees of increasing size n, selected via a random process, and to investigate the degree of commonality given by a natural measure of the extent to which they agree completely on peripheral subtrees. Using experimental evidence, we find that the degree of commonality appears to grow linearly with tree size, and we estimate the average behavior.

There are a number of processes for selecting trees randomly. One commonly studied method is the uniform distribution on trees, where each tree is equally likely to be selected. Some properties of the reduction behavior of trees selected uniformly at random were investigated by Cleary, Elder, Rechnitzer and Taback [Cleary et al. 2010] in their study of statistical properties of Thompson’s group F. They showed that a tree pair selected from the uniform distribution on tree pairs is almost surely unreduced in the sense described below. The common subtrees investigated here via reduction are a particular case of common edges, where in the