Abstract

Syst. Biol. 49(1):160–171, 2000. Points of View

The use of bootstrap analyses as a method of assessing internal support on phylogenetic trees was first proposed by Felsenstein (1985), and since that time the application and usefulness of bootstrap analyses have been extensively discussed (e.g., Sanderson, 1989, 1995; Felsenstein and Kishino, 1993; Hillis and Bull, 1993). Although opinions vary, even a casual review of recent systematics journals reveals that bootstrap analyses have become a routine part of many phylogenetic analyses.

The goal of this paper is not to discuss the merits or pitfalls of bootstrap analyses but rather to compare bootstrap and jackknife analyses for several aspects of performance. Selected for comparison are mean support values per node, speed, and repeatability.

The bootstrap is a statistical method for obtaining an estimate of error (Efron, 1979, 1982; Hedges, 1992). Bootstrapping, as applied during parsimony analyses, involves random sampling with replacement of a set of characters until a replicate data set of the same size as the original data set is constructed. This replicate data set is subsequently analyzed, and a phylogenetic tree is reconstructed according to a specified search strategy. This process is repeated a specified number of times, and the results are then summarized as a bootstrap consensus tree. The frequency at which each clade is recovered is termed the bootstrap proportion, or bootstrap support.

Jackknife analyses (reviewed by Miller, 1974) have also been used to estimate internal support on phylogenetic trees (e.g., Farris et al., 1996; Källersjö et al., 1999). The jackknife differs from the bootstrap in that characters are sampled from the original data set without replacement to construct a replicate data set that has fewer characters than the original. Replicate jackknife data sets are subjected to phylogenetic analysis, and the results of these replicate searches are summarized as a consensus tree, yielding jackknife support values. Because a jackknifed data set is smaller than a bootstrapped data set, a jackknifed data set will typically contain less phylogenetic information than a bootstrapped one. The smaller size of the jackknifed data set leads to the expectation that the replicate searches will have fewer phylogenetically informative data for tree-building, and that jackknife values will generally be lower and more variable than bootstrap values. Furthermore, the percentage of characters deleted in a jackknife analysis can affect the jackknife values. Jackknifing with 33% character deletion will produce data sets that generally contain more phylogenetic information than those produced by jackknifing
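The two resampling schemes described above can be illustrated with a minimal Python sketch. It is not the procedure of any particular phylogenetics program: the function names, the dictionary-based character matrix, and the representation of a clade as a frozenset of taxon names are all assumptions made for illustration. The tree search itself is omitted, and the jackknife here removes a fixed fraction of characters, whereas some implementations instead delete each character independently with a given probability.

```python
import random


def bootstrap_columns(n_chars, rng):
    """Draw n_chars character indices WITH replacement.

    The replicate has the same number of characters as the original,
    so some characters appear several times and others not at all.
    """
    return [rng.randrange(n_chars) for _ in range(n_chars)]


def jackknife_columns(n_chars, deletion_fraction, rng):
    """Draw character indices WITHOUT replacement.

    Assumption: a fixed fraction of characters is deleted, so the
    replicate is smaller than the original data set.
    """
    n_keep = n_chars - round(n_chars * deletion_fraction)
    return sorted(rng.sample(range(n_chars), n_keep))


def resample_matrix(matrix, columns):
    """Build a replicate data set from the chosen character indices.

    matrix maps each taxon name to its list of character states
    (a hypothetical layout chosen for this sketch).
    """
    return {taxon: [states[i] for i in columns]
            for taxon, states in matrix.items()}


def support_values(replicate_clades):
    """Fraction of replicates in which each clade was recovered.

    replicate_clades holds one set of clades per replicate tree;
    each clade is a frozenset of taxon names.
    """
    counts = {}
    for clades in replicate_clades:
        for clade in clades:
            counts[clade] = counts.get(clade, 0) + 1
    n_replicates = len(replicate_clades)
    return {clade: count / n_replicates for clade, count in counts.items()}
```

In use, each replicate built with `resample_matrix` would be passed to a tree search, the clades of the resulting tree collected, and `support_values` applied to the collected clade sets to obtain bootstrap or jackknife support, depending on which column sampler produced the replicates.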
