Abstract

Fundamental considerations of phylogenetic analysis are reviewed in the context of treating large molecular matrices. While molecular data sets with dozens or hundreds of taxa are increasingly common in phylogenetic inference studies, several computational issues, some unique to such large matrices, others general in phylogenetic inference, nonetheless confront molecular systematists. The most controversial of these, the choice among phylogenetic inference methods, bears directly on the analysis of molecular data sets. Maximum likelihood methods have been implemented exclusively for molecular data, but their burdensome computational load becomes acute as the number of taxa being analyzed grows. While there are several reasons to prefer parsimony to maximum likelihood generally, the infeasibility of using likelihood to treat matrices with many terminals and the desirability of combining morphological and molecular data in simultaneous analysis lead to a preference for parsimony more or less by default. Terminal selection and the coding of subset polymorphisms and inapplicable character data are of no less critical concern to molecular systematists than to morphologists. Shortcuts such as collapsing taxa to form “composite” terminals should be viewed with caution. Measures of nodal support, all of which are problematic in one or more ways, may be computationally prohibitive for large matrices. The relatively novel technique of parsimony jackknifing may provide a desirable means of evaluating the robustness of phylogenetic inference, especially as the generation of sequence data becomes increasingly routine.

