Abstract

A robust phylogenetic framework, with extensive geographical and taxonomic sampling, well-resolved species relationships and high certainty in tree topology and branch-length estimates, is critical for the study of macroevolutionary patterns. Whereas Sanger sequencing-based methods usually recover insufficient phylogenetic signal, especially in recently diversified lineages, reduced-representation sequencing methods tend to provide well-supported phylogenetic relationships but usually entail considerable bioinformatic challenges because of the inherent trade-off between the number of SNPs and the magnitude of the associated error rates. The genus Helianthemum (Cistaceae) is a species-rich and taxonomically complex Palearctic group of plants that has diversified mainly since the Upper Miocene. It is a challenging case study because previous attempts using Sanger sequencing were unable to resolve the intrageneric phylogenetic relationships. Aiming to obtain a robust phylogenetic reconstruction based on genotyping-by-sequencing (GBS), we established a rigorous methodological workflow in which we i) explored how variable settings during dataset assembly affect error rates and the degree of resolution under concatenation and coalescent approaches, ii) assessed the effect of two extreme parameter configurations (minimizing error rates vs. maximizing phylogenetic resolution) on tree topology and branch lengths, and iii) evaluated the effects of these two configurations on estimates of divergence times and diversification rates. Our analyses produced highly supported, topologically congruent phylogenetic trees for both configurations. However, minimizing error rates produced more reliable branch lengths, which critically affected the accuracy of downstream analyses (i.e. divergence times and diversification rates).
In addition to recommending a revision of intrageneric systematics, our results enabled us to identify three highly diversified lineages in Helianthemum in contrasting geographical areas and ecological conditions, which started radiating in the Upper Miocene.

Highlights

  • The establishment of a robust phylogenetic framework is the initial step for the study of macroevolutionary patterns of specific lineages and requires extensive geographical and taxonomic representativeness, strong statistical support for species relationships and accurate estimates of tree topology and branch lengths

  • A similarity threshold of 90% always yielded lower error rates than either the 85% threshold or the combination of 90% in step 3 and 85% in step 6

  • Compared with the previous phylogenetic reconstruction of the genus Helianthemum using Sanger sequencing, in which species and subspecies were mostly recovered in polytomies (Aparicio et al., 2017), here we generated a much more robust species- and subspecies-level phylogenetic tree incorporating high geographical and taxonomic representativeness, strong statistical support for taxon relationships, and accurate estimates of tree topology and branch lengths

Summary

Introduction

The establishment of a robust phylogenetic framework is the initial step for the study of macroevolutionary patterns of specific lineages and requires extensive geographical and taxonomic representativeness, strong statistical support for species relationships and accurate estimates of tree topology and branch lengths. These goals cannot be achieved in phylogenetic analyses of recently diversified lineages when Sanger sequencing approaches are used. Methods based on next-generation sequencing (NGS), in turn, present notable methodological challenges, including i) the high DNA quality generally required (Andrews et al., 2016), ii) the complexity of the assembly and bioinformatic processing (Shafer et al., 2017), iii) the constraints and assumptions of the two approaches currently used in phylogenomics (i.e. concatenation and coalescent approaches; Meiklejohn et al., 2016), iv) the limits of available computing power (Glor, 2010), and v) the biological limitations on data collection (i.e. allele dropout because of mutations at restriction sites; Andrews et al., 2016; Table S1).
