Abstract

Targeted high-throughput sequencing using hybrid enrichment offers a promising source of data for inferring multiple, meaningfully resolved, independent gene trees suitable for addressing challenging phylogenetic problems in species complexes and rapid radiations. The targets in question can either be adopted directly from more or less universal tools, or custom-made for particular clades at considerably greater effort. We applied custom-made scripts to select sets of homologous sequence markers from transcriptome and whole genome sequencing (WGS) data for use in the flowering plant genus Erica (Ericaceae). We compared the resulting targets to those that would be selected both using different available tools (Hyb-Seq; MarkerMiner) and when optimising for broader clades of more distantly related taxa (Ericales; eudicots). Approaches comparing more divergent genomes (including MarkerMiner, irrespective of input data) delivered fewer and shorter potential markers than those targeted for Erica. The latter may nevertheless be effective for sequence capture across the wider family Ericaceae. We tested the targets delivered by our scripts by obtaining an empirical dataset. The resulting sequence variation was lower than that of standard nuclear ribosomal markers (which in Erica fail to deliver a well-resolved gene tree), confirming the importance of maximising the lengths of individual markers. We conclude that rather than searching for “one size fits all” universal markers, we should improve and make more accessible the tools necessary for developing “made to measure” ones.

Highlights

  • DNA sequence data is the cornerstone of comparative and evolutionary research, invaluable for inference of population-level processes and species delimitation through to higher level relationships

  • We developed custom-made Python 2.7.6 scripts to identify the wider pool of all potential target sequences from transcriptome and whole genome sequencing (WGS) data, and applied already available scripts/software for comparison

  • When sequence variation is appropriate and gene trees are consistent, standard Sanger sequencing of a small number of markers may be all that is required to infer robust and meaningful phylogenetic trees


Introduction

DNA sequence data is the cornerstone of comparative and evolutionary research, invaluable for inference of population-level processes and species delimitation through to higher level relationships. Universal primers such as for plastid (Taberlet et al., 1991), nuclear ribosomal (White et al., 1990) and even single- or low-copy nuclear (Blattner, 2016) sequences have been widely applied to infer evolutionary histories. When it is not possible to generate a robust and unambiguous phylogenetic hypothesis using standard universal markers, protocols for alternative low-copy genes are highly desirable (Sang, 2002; Hughes, Eastwood & Bailey, 2006).
