Abstract
Whole-genome amplification by multiple displacement amplification (MDA) is a promising technique to enable the use of samples with only limited amounts of DNA for the construction of RAD-seq libraries. Previous work has shown that, when the amount of DNA used in the MDA reaction is large, double-digest RAD-seq (ddRAD) libraries prepared with amplified genomic DNA result in data that are indistinguishable from libraries prepared directly from genomic DNA. Based on this observation, here we evaluate the quality of ddRAD libraries prepared from MDA-amplified genomic DNA when the amount of input genomic DNA and the coverage obtained for samples are variable. By simultaneously preparing libraries for five species of weevils (Coleoptera, Curculionidae), we also evaluate the likelihood that potential contaminants will be encountered in the assembled dataset. Overall, our results indicate that MDA may not be able to rescue all samples with small amounts of DNA, but it does produce ddRAD libraries adequate for studies of phylogeography and population genetics even when conditions are not optimal. We find that MDA makes it harder to predict the number of loci that will be obtained for a given sequencing effort, with some samples behaving like traditional libraries and others yielding fewer loci than expected. This seems to be caused by both stochastic and deterministic effects during amplification. Further, the reduction in loci is stronger in libraries with lower amounts of template DNA for the MDA reaction. Even though a few samples exhibit substantial levels of contamination in raw reads, the effect is very small in the final dataset, suggesting that filters imposed during dataset assembly are important in removing contamination.
Importantly, samples with strong signs of contamination and biases in heterozygosity were also those with fewer loci shared in the final dataset, suggesting that stringent filtering of samples with significant amounts of missing data is important when assembling data derived from MDA-amplified genomic DNA. Overall, we find that the combination of MDA and ddRAD results in high-quality datasets for population genetics as long as the sequence data are properly filtered during assembly.
Highlights
Double-digest RAD sequencing (Peterson et al., 2012) and other methods of genotyping-by-sequencing are inexpensive and flexible tools that allow researchers to sequence a large number of loci from non-model organisms for phylogenetic and population-level studies (Andrews et al., 2016).
We modeled the relationship between this dissimilarity matrix as a response and several predictors that we believe could be associated with the recovery of a more similar set of final loci (MDA, population, size selection pool and log-transformed number of loci in the final dataset for each sample) using a multivariate distance matrix regression (MDMR) (Anderson, 2001; McArdle & Anderson, 2001) implemented in the R package MDMR v. 0.5.0 (McArtor, Lubke & Bergeman, 2016; McArtor, 2017, 2018).
There is generally a decrease in the number of assembled loci for a given number of reads (Fig. 3). This contrasts with the results of Blair, Campbell & Yoder (2015), who did not find any difference in the number of loci recovered from multiple displacement amplification (MDA) and direct libraries. This difference might be explained, at least in part, by the smaller amount of input DNA used in some samples here, since we found a significant effect of the amount of input genomic DNA in an MDA reaction on the number of loci, after controlling for the number of reads obtained (Fig. 2).
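To make the distance-matrix regression mentioned above more concrete, the following is a minimal Python sketch of the pseudo-F statistic described by McArdle & Anderson (2001), paired with a simple permutation test. It is an illustration only and does not reproduce the R package MDMR, whose test statistic and p-value computation may differ. The function names (`gower_center`, `pseudo_f`, `permutation_p_value`), the default of 199 permutations, and the choice to permute predictor rows are all assumptions made for this example.

```python
import numpy as np


def gower_center(d):
    """Gower-centre a distance matrix: G = -1/2 * C (D squared elementwise) C."""
    n = d.shape[0]
    c = np.eye(n) - np.ones((n, n)) / n  # centring matrix
    return -0.5 * c @ (d ** 2) @ c


def pseudo_f(d, x):
    """Pseudo-F for regressing a distance matrix d (n x n) on predictors x."""
    d = np.asarray(d, dtype=float)
    n = d.shape[0]
    # Design matrix with an intercept column prepended (x may be 1-D or 2-D).
    x1 = np.column_stack([np.ones(n), np.asarray(x, dtype=float)])
    h = x1 @ np.linalg.pinv(x1)            # hat (projection) matrix
    g = gower_center(d)
    m = np.linalg.matrix_rank(x1) - 1      # predictor degrees of freedom
    resid = np.eye(n) - h
    explained = np.trace(h @ g @ h) / m
    unexplained = np.trace(resid @ g @ resid) / (n - m - 1)
    return explained / unexplained


def permutation_p_value(d, x, n_perm=199, seed=0):
    """Permutation p-value: shuffle predictor rows relative to the distances."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    observed = pseudo_f(d, x)
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(x.shape[0])
        if pseudo_f(d, x[perm]) >= observed:
            hits += 1
    # Count the observed arrangement as one of the permutations.
    return (hits + 1) / (n_perm + 1)
```

With a single response variable and Euclidean distances, this pseudo-F reduces to the ordinary regression F statistic, which is a convenient sanity check. In the setting described here, `d` would be the between-sample dissimilarity matrix and the columns of `x` would encode predictors such as MDA treatment, population, size selection pool and log-transformed number of loci.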
Summary
Double-digest RAD sequencing (ddRAD) (Peterson et al., 2012) and other methods of genotyping-by-sequencing are inexpensive and flexible tools that allow researchers to sequence a large number of loci from non-model organisms for phylogenetic and population-level studies (Andrews et al., 2016). One caveat is that this protocol is still not highly optimized, requiring a very large read depth in comparison to ddRAD and resulting in a much larger cost per sample in addition to the higher cost for library preparation. Another option for samples with smaller DNA amounts is to increase the DNA available per individual by using whole-genome amplification prior to RAD library preparation. In the context of ddRAD, Blair, Campbell & Yoder (2015) sequenced four samples of a single species at high coverage and using a high amount of starting DNA, following the manufacturer’s protocol for the reaction of whole-genome amplification. Even though their results were encouraging, it remains to be shown whether whole-genome amplification is robust in more typical conditions in which it might be used: the study of many samples with uneven coverage and small quantities of DNA available.