Abstract

Phylogeny is central to the understanding of biodiversity and evolutionary processes. However, elucidating phylogenetic relationships in many groups has remained problematic due to their sheer size. The feasibility of phylogenetic analyses of large data sets has been questioned on both theoretical and empirical grounds. Some have suggested that large data sets be broken into a series of smaller problems for phylogenetic analysis. However, recent empirical studies and critical developments in methods of data analysis indicate that large data sets are tractable. We have learned a great deal about the analysis of large data sets via the angiosperms, for which three large molecular data sets have been constructed (plastid atpB and rbcL and nuclear 18S rDNA). We discuss three approaches successfully applied in our analyses of these large data sets. Parsimony analyses of separate and combined data sets representing hundreds of taxa indicate that “bigger is better.” That is, both empirical and simulation studies demonstrate that two solutions to the dilemmas posed by large data sets are the addition of taxa and the addition of characters. Recent developments in software also greatly facilitate the parsimony analysis of large data sets. Applications such as NONA and the RATCHET can retrieve shorter trees than those found by PAUP, and in much shorter run times. Recently developed “quick search” methods, such as the fast bootstrap and fast jackknife, are also of great utility in the analysis of large data sets. These methods are rapid and emphasize only those clades with strong support. All three of these approaches have recently been applied to a 567-taxon data set for angiosperms based on atpB, rbcL, and 18S rDNA sequences (a total of 4733 bp/taxon). Analyses of the combined three-gene data set have yielded the best-resolved and best-supported topology to date for angiosperms, with virtually all major clades, as well as the spine of the tree, well supported. These developments indicate that the phylogenetic analysis of large data sets is not only feasible, but relatively straightforward.

