Abstract

Methods for reconstructing evolutionary history are sensitive to the number and position of taxa included in the analysis (e.g., Gauthier et al., 1988; Hendy and Penny, 1989; Lecointre et al., 1993; Poe, 1998). Figure 1 illustrates this phenomenon using parsimony analyses of a sample of four Anolis lizard species from a large matrix of morphological and molecular data (Poe, 2001). Any of the three possible relationships for these species can be obtained by including appropriate additional species in the same analysis. In addition to demonstrating the instability of results relative to taxon sampling, this example shows conclusively that addition of taxa may be beneficial to the accuracy of a phylogenetic analysis but may also be detrimental to it. This conclusion holds because any of the topological results can be changed either by adding or by subtracting taxa, and even though we do not know which of these three trees is the true tree, we can assume that one of them is correct and two of them are wrong.

One might think that the sensitivity to taxon sampling shown in Figure 1 is restricted to certain methods or to poorly supported trees. Unfortunately, this is not the case, as shown by the example in Figure 2. These trees were reconstructed using the mitochondrial DNA sequence data of Jackman et al. (1999) for Anolis lizards. Tree a in Figure 2 is obtained when these taxa are analyzed alone using maximum likelihood and minimum evolution under complex models (HKY + G; Hasegawa et al., 1985; Yang, 1994; parameter values estimated from data) and using parsimony with equal weights for all character changes. Tree b is obtained using these same methods but with three other lizard species included in the analyses.
This comparison shows that sensitivity to taxon sampling may occur even with strongly supported trees and diverse methods of estimation (and shows that high bootstrap values and strongly supported congruence between methods are not necessarily predictors of accuracy).

The above examples show the potentially extreme sensitivity of phylogenetic methods to taxon sampling, but they are of little help in devising a taxon sampling strategy for maximizing the accuracy of a phylogenetic analysis. When a researcher is interested in the relationships of a set of clades from which exemplar taxa are chosen, is it better or worse to include additional taxa in the analysis? Clearly, simply including more taxa without additional character information can be detrimental to accuracy, because more characters are needed to resolve a greater number of nodes. However, addition of more taxa adds information about evolutionary history (e.g., Gauthier et al., 1988), which seems likely to have a positive effect on accuracy. Given these potentially opposing effects, what is the best taxon-sampling strategy for maximizing the accuracy of phylogenetic analyses?

Phylogenies that include lineages that have undergone extensive evolution are difficult to reconstruct because of the phenomenon of long branch attraction (Felsenstein, 1978; Huelsenbeck and Hillis, 1993). Thus, a beneficial sampling strategy might involve shortening long branches by including additional taxa, assuming that such taxa exist (Hendy and Penny, 1989). This strategy has been evaluated for the parsimony method by Graybeal (1998) and by Poe and Swofford (1999). Graybeal confirmed Hendy and Penny’s (1989) prediction that long-branch subdivision can have a strong beneficial effect on the accuracy of estimation of four-taxon trees in the Felsenstein zone, where two long opposing branches are separated by a short internal branch.
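The Felsenstein zone can be made concrete with a small exact calculation. The sketch below is illustrative only and is not taken from the analyses described here: it assumes a two-state symmetric substitution model, a true tree ((A,B),(C,D)) in which A and C sit on long branches, and hypothetical change probabilities `p_long` and `p_short`. It computes the expected frequency of each parsimony-informative site pattern and reports which four-taxon topology parsimony would favor given unlimited data.

```python
from itertools import product


def pattern_probs(p_long, p_short):
    """Expected frequencies of the three parsimony-informative site patterns.

    Assumed setup (not from the source): two-state symmetric model on the
    true tree ((A,B),(C,D)). A and C are long branches with change
    probability p_long; B, D and the internal branch use p_short.
    """
    def edge(x, y, p):
        # Probability of observing states x and y at the two ends of a branch.
        return p if x != y else 1.0 - p

    support = {"AB|CD": 0.0, "AC|BD": 0.0, "AD|BC": 0.0}
    for a, b, c, d in product((0, 1), repeat=4):
        # Sum over the unobserved states at internal nodes u (A,B) and v (C,D).
        prob = 0.0
        for u, v in product((0, 1), repeat=2):
            prob += (
                0.5  # uniform prior on the state at node u
                * edge(u, v, p_short)
                * edge(u, a, p_long)
                * edge(u, b, p_short)
                * edge(v, c, p_long)
                * edge(v, d, p_short)
            )
        # Assign the pattern to the topology it supports under parsimony.
        if a == b and c == d and a != c:
            support["AB|CD"] += prob
        elif a == c and b == d and a != b:
            support["AC|BD"] += prob
        elif a == d and b == c and a != b:
            support["AD|BC"] += prob
    return support


def parsimony_choice(p_long, p_short):
    """Topology with the highest expected informative-pattern frequency."""
    support = pattern_probs(p_long, p_short)
    return max(support, key=support.get)


if __name__ == "__main__":
    for p_long in (0.1, 0.4):
        print(p_long, parsimony_choice(p_long, 0.05), pattern_probs(p_long, 0.05))
```

Under these assumptions, with `p_short = 0.05` the true grouping AB|CD is favored when `p_long = 0.1`, but the incorrect grouping AC|BD, which joins the two long branches, is favored when `p_long = 0.4`. Long-branch subdivision aims to move an analysis out of the second regime by breaking up the long branches with additional taxa.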
Poe and Swofford (1999) examined a wider range of model trees and discovered several conditions of the kind discussed by Zharkikh and Li (1993) under which long-branch subdivision was detrimental to accuracy.

The taxon sampling strategy of long-branch subdivision (LBS) has not been examined for methods other than parsimony. Poe and Swofford (1999) suggested that phylogenetic methods that take branch lengths into account are less likely to be affected by the problems of LBS that afflicted their application of the parsimony method. Pollock and Bruno (2000:1858) concluded that “the notion that added taxa can decrease accuracy . . . should be abandoned as an artifact of parsimony.” Although it seems likely that LBS will be beneficial when the model
