Abstract
Researchers studying nonmodel organisms have an increasing number of methods available for generating genomic data. However, the applicability of different methods across species, as well as the effect of reference genome choice on population genomic inference, remain difficult to predict in many cases. We evaluated the impact of data type (whole-genome vs. reduced representation) and reference genome choice on data quality and on population genomic and phylogenomic inference across several species of darters (subfamily Etheostomatinae), a highly diverse radiation of freshwater fish. We generated a high-quality reference genome and developed a hybrid RADseq/sequence capture (Rapture) protocol for the Arkansas darter (Etheostoma cragini). Rapture data from 1,900 individuals spanning four darter species showed recovery of most loci across darter species at high depth and consistent estimates of heterozygosity regardless of reference genome choice. Loci with baits spanning both sides of the restriction enzyme cut site performed especially well across species. For low-coverage whole-genome data, choice of reference genome affected read depth and inferred heterozygosity. For similar amounts of sequence data, Rapture performed better at identifying fine-scale genetic structure compared to whole-genome sequencing. Rapture loci also recovered an accurate phylogeny for the study species and demonstrated high phylogenetic informativeness across the evolutionary history of the genus Etheostoma. Low cost and high cross-species effectiveness regardless of reference genome suggest that Rapture and similar sequence capture methods may be worthwhile choices for studies of diverse species radiations.
Highlights
The advent of high-throughput sequencing (HTS) technology has enabled biologists to generate genome-scale molecular data from a variety of organisms, creating new opportunities for conservation genetics (Shafer et al 2015), phylogenetics (Lemmon and Lemmon 2013, McCormack et al 2013), and molecular ecology (Ekblom and Galindo 2011).
We ask the following questions to gauge the performance and applicability of the method: 1) How often are loci sequenced using the restriction-associated DNA sequencing (RADseq)/sequence capture (Rapture) baits recovered at high coverage (>20x), and how many reads per individual are needed to attain high coverage? 2) How much diversity is present within the set of Rapture loci for both the target species and for other darter species? 3) Can the Rapture loci identify distinct population units within E. cragini? 4) Do the Rapture loci recover known phylogenetic relationships among and within species? We demonstrate how the choice of data type (Rapture vs. WGS)
Should I sequence loci over the entire genome or should I use sequence capture to target a smaller number of loci at high depth? Should I generate a reference genome for my species or will I be able to use a reference genome from a closely related species, and how will this choice affect the interpretation of my data? Will one methodology work well across all target populations and species? And how cost-effective are these alternative methods? All of these questions are perhaps even more relevant for projects aimed at diverse species radiations, as such projects by their nature encompass a number of closely related species.
Summary
The advent of high-throughput sequencing (HTS) technology has enabled biologists to generate genome-scale molecular data from a variety of organisms, creating new opportunities for conservation genetics (Shafer et al 2015), phylogenetics (Lemmon and Lemmon 2013, McCormack et al 2013), and molecular ecology (Ekblom and Galindo 2011). Although non-model organisms represent fruitful study systems for answering basic questions in biology (Russell et al 2017), deciding on appropriate methods for generating and handling genomic data for these species remains a challenge. Whole-genome sequencing may still be out of reach for large-scale studies of non-model organisms, and reduced-representation approaches have therefore grown popular as an effective means of answering many questions (da Fonseca et al 2016, Meek and Larson 2019). A hybrid method termed ‘Rapture’ (Ali et al 2016), which combines restriction-associated DNA sequencing (RADseq) with targeted enrichment of a user-defined subset of hundreds to thousands of RAD loci, has great potential as a rapid, low-cost method for generating repeatable high-throughput genomic data. Although the application of Rapture has mainly focused on population genomics within species, Rapture loci developed for one species have also proven useful for studying hybridization among closely related species (Peek et al 2019) and for work across species within slowly evolving lineages (Komoroske et al 2019).
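As a rough illustration of how sequencing effort scales with the size of a capture panel, the sketch below estimates the number of reads per individual needed to reach a target mean depth per locus, such as the 20x threshold mentioned above. It is back-of-the-envelope arithmetic, not the authors' analysis: the function names, the panel size, and the on-target fraction are all hypothetical, and the calculation assumes reads are spread evenly across loci, which real capture data rarely satisfy.

```python
import math


def reads_needed(n_loci: int, target_depth: float, on_target_fraction: float) -> int:
    """Reads per individual required for a given MEAN depth per locus.

    Assumes (hypothetically) that each on-target read covers exactly one
    locus and that on-target reads are spread evenly across all loci.
    """
    if n_loci <= 0 or target_depth <= 0:
        raise ValueError("n_loci and target_depth must be positive")
    if not 0 < on_target_fraction <= 1:
        raise ValueError("on_target_fraction must be in (0, 1]")
    return math.ceil(n_loci * target_depth / on_target_fraction)


def expected_depth(reads: int, n_loci: int, on_target_fraction: float) -> float:
    """Mean depth per locus obtained from a given number of reads."""
    return reads * on_target_fraction / n_loci


if __name__ == "__main__":
    # Illustrative values only; none of these numbers come from the study.
    panel_size = 2000        # hypothetical number of targeted RAD loci
    depth = 20               # target mean depth (20x)
    on_target = 0.5          # hypothetical fraction of reads hitting baited loci

    n_reads = reads_needed(panel_size, depth, on_target)
    print(f"Reads per individual for {depth}x: {n_reads}")
    print(f"Check, mean depth at that effort: {expected_depth(n_reads, panel_size, on_target)}x")
```

Because depth varies among loci in practice, a mean of 20x does not guarantee that every locus reaches 20x; an empirical reads-versus-coverage curve from pilot data is a better guide than this estimate.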