Abstract

Density of taxon sampling and number/kind of characters are central to achieving the ultimate goals in phylogenetic reconstruction: tree robustness and improved accuracy. In molecular phylogenetics, DNA sequence repositories such as GenBank are potential sources for expanding datasets in two dimensions, taxa and characters, to the level of “supermatrices.” However, the issue of missing characters/genomic regions is generally considered a major impediment to this endeavor. We used here the angiosperm order Caryophyllales to systematically address the impact of missing data when expanding taxon sampling and number of characters in phylogenetic reconstruction. Our analyses show that expansion of taxon sampling by ~13-fold resulted in improved phylogenetic assessment of the Caryophyllales despite up to 38% missing data. Expanding number of characters in the dataset by allowing for up to 100-fold increase in amount of missing data and inclusion of entries with about 40% missing genomic regions did not negatively impact tree structure or robustness, but to the contrary improved both. These results are timely regarding the ongoing efforts to achieve detailed assessment of the tree of life.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call