Abstract
Consensus trees are used in phylogenetics as summaries or representations of sets of source trees. Here we ask ‘How good are consensus trees?’ in the sense of how well individual consensus trees represent the sets of source trees for which they stand. There are many different consensus methods and various contexts in which they may be used (Swofford, 1991; Wilkinson, 1994; Leclerc, 1998). Consequently, answers to our question must be specific as to both method and context. For example, majority-rule consensus trees (Margush and McMorris, 1981; Wilkinson, 1996) can provide useful graphical summaries of bootstrap or jackknife analyses but can be problematic when used to represent a set of equally optimal trees from the analysis of a single data set (Wilkinson and Benton, 1996). Here we focus on strict consensus methods sensu Wilkinson (1994), that is, methods that require unanimous agreement among the source trees, and on the contexts in which they are commonly used. These contexts include the representation of the set of optimal trees for a single data set, the comparison of simulated trees and trees inferred from simulated data, and the quantification of the similarity of trees derived from different data sets in studies of taxonomic congruence. We describe a simple measure of consensus efficiency that allows us to say how well a particular strict consensus tree is doing its job of faithfully representing the source trees. Consensus methods differ in the type of information they represent and in the level of agreement required among the source trees for information of that type to be included in the consensus tree (Page, 1992). This is reflected in the consensus terminology of Wilkinson (1994), which we use here, in which the names of consensus methods combine descriptors of the type of information (e.g., component, Adams) and the level of agreement (e.g., strict, majority-rule).
Strict consensus trees provide information by permitting (or, conversely, prohibiting) a subset of the possible trees (Page, 1992; Wilkinson, 1994; Thorley et al., 1998). Consensus efficiency is a relation between the trees permitted by the consensus tree and the source trees. An ideal or maximally efficient strict consensus tree would permit only the source trees that it represents. Consensus trees might deviate from the ideal in two ways: first, they might permit trees that are not source trees, and second, they might fail to permit some of the source trees. Both behaviors would reduce the correspondence between the consensus and the source trees that the consensus is intended to represent, and thereby would reduce the efficiency of the consensus tree. In practice, strict consensus trees must permit all the source trees. Thus the efficiency of a consensus tree is maximal when it permits only the source trees and is reduced as it permits additional trees. A maximally inefficient consensus representation is a consensus tree that prohibits no trees (i.e., a bush) when the set of source trees does not include all possible binary trees. A measure of consensus efficiency (CE) that has these properties and that ranges between values of zero (minimal efficiency) and one (maximal efficiency) is given by:
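As a separate illustration of what it means for a consensus tree to "permit" trees (this is not the CE expression itself), the following minimal Python sketch counts the binary trees permitted by a consensus tree. It rests on assumptions that are not stated in the text above: trees are rooted, the consensus is of the component (cluster) type, and the permitted trees are taken to be the binary resolutions of the consensus. The nested-tuple representation and the function names are hypothetical, chosen only for this example. The count uses the standard result that a polytomy with k children can be resolved into (2k-3)!! rooted binary trees.

```python
from math import prod


def odd_double_factorial(m):
    """Return m!! for odd m; by convention 1 when m <= 1."""
    return prod(range(m, 1, -2)) if m > 1 else 1


def count_permitted(tree):
    """Count the rooted binary trees that resolve `tree`.

    `tree` is a leaf label (str) or a tuple of subtrees (hypothetical
    representation). A node with k children can be resolved in
    (2k - 3)!! ways, independently of how its subtrees are resolved.
    """
    if isinstance(tree, str):
        return 1
    k = len(tree)
    return odd_double_factorial(2 * k - 3) * prod(count_permitted(c) for c in tree)


bush = ("A", "B", "C", "D")              # prohibits no trees
partial = (("A", "B"), "C", "D")         # only the clade AB is asserted
resolved = ((("A", "B"), "C"), "D")      # fully binary

print(count_permitted(bush), count_permitted(partial), count_permitted(resolved))
```

Under these assumptions, if the source trees were two of the three binary trees containing the clade AB, their strict component consensus would be `partial`, which permits three trees and therefore one tree that is not a source tree; if all three were source trees, the same consensus would permit only the source trees.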