Abstract
Large-scale phylogenetic analyses involving thousands of rRNA sequences are complicated by length variability, which compounds the already complex problem of large tree searches. Here, we generated a large data matrix and tested phylogenetic procedures for large-scale analysis in the Coleoptera (beetles), as a resource for evolutionary biology and identification of this hugely diverse group. The analysis included nearly 1200 species, with representatives of 126 (75%) families, all 18 superfamilies of Polyphaga, and the four suborders. Alignments were obtained by a fragment-extension method derived from the BLAST algorithm using the BlastAlign script [Belshaw, R., Katzourakis, A., 2005. BlastAlign: a program that uses blast to align problematic nucleotide sequences. Bioinformatics 21, 122–123], followed by fast parsimony and maximum likelihood searches. Trees were assessed against the existing classification, using a formal procedure for coding the hierarchical position of taxa and establishing taxonomic congruence. We found that the BlastAlign procedure greatly outperformed standard progressive alignment methods such as Clustal. The resulting trees, when used as guide trees, also greatly improved the Clustal-based alignments. Long-branch attraction, which potentially affects the quality of the tree, was reduced by the systematic removal of all branches longer than the 95% interval of the distribution of branch lengths. We applied this protocol to test the monophyly of major proposed lineages of Coleoptera, including Crowson’s 18 superfamilies in the hyperdiverse suborder Polyphaga. While searches for very large trees remained challenging and details of the tree topology were not always satisfactory, the strategy for alignment and tree searches used here makes large-scale phylogenetics of super-diverse groups such as Coleoptera amenable to desktop computing.
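The long-branch filter mentioned above can be illustrated with a minimal sketch. The abstract does not say how the 95% interval was computed, so this example assumes a one-sided cutoff at the 95th percentile of terminal branch lengths, with linear interpolation between ranks. The function names `percentile` and `flag_long_branches` and the input format (a mapping from taxon name to branch length) are hypothetical and are not taken from the paper.

```python
import math


def percentile(values, q):
    """Return the q-th percentile (0-100) of values, interpolating linearly."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile() needs at least one value")
    pos = (len(ordered) - 1) * q / 100.0
    lower = math.floor(pos)
    upper = math.ceil(pos)
    if lower == upper:
        return ordered[lower]
    fraction = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def flag_long_branches(branch_lengths, q=95.0):
    """Flag taxa whose branch length exceeds the q-th percentile.

    branch_lengths: dict mapping taxon name -> terminal branch length.
    The one-sided percentile cutoff is an assumption of this sketch,
    not the procedure documented in the paper.
    Returns (cutoff, sorted list of flagged taxon names).
    """
    cutoff = percentile(branch_lengths.values(), q)
    flagged = sorted(
        taxon for taxon, length in branch_lengths.items() if length > cutoff
    )
    return cutoff, flagged
```

In a workflow of this kind, the flagged taxa would be dropped from the matrix and the tree search repeated; the choice of terminal branches only, rather than all branches, is a further simplification of this sketch.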