Abstract

Massively parallel sequencing is rapidly emerging as an efficient way to quantify biodiversity at all levels, from genetic variation and expression to ecological community assemblage. However, the number of reads produced per sequencing run far exceeds the number required per sample for many applications, compelling researchers to sequence multiple samples per run in order to maximize efficiency. For studies that include a PCR step, this can be accomplished using primers that include an index sequence, which allows sample origin to be determined after sequencing. The use of indexed primers assumes they behave no differently from standard primers; however, we found that indexed primers cause substantial template sequence-specific bias, resulting in radically different profiles of the same environmental sample. Two indexed primer sets spuriously changed the inferred sequence abundance from the same DNA extraction by up to 77.1%, likely as a result of differential amplification efficiency due to primer-template mismatch. We demonstrate that a double PCR approach alleviates these effects in applications where indexed primers are necessary.

Highlights

  • The plummeting cost of DNA sequencing has led to the widespread adoption of DNA sequence-based approaches to a wide variety of biological problems [1,2,3,4]

  • An increasingly popular technique for identifying the biological variants present in a sample comprised of template DNA from multiple sources is the parallel sequencing of nucleotide fragments generated by PCR [5, 6]; this approach has seen application to problems such as the bulk identification of organisms either in a combined tissue (e.g. [7,8,9]) or environmental samples (e.g. [10, 11])

  • A total of 24100 and 35909 OTUs were obtained from the single and double PCR experiments, respectively. To confirm these sequences were from organisms likely to occur in this environment, we used the blastn algorithm [34] to compare our sequences to the NCBI nucleotide database and report the results for the 10 most abundant OTUs of each treatment (S2 Table)


Introduction

The plummeting cost of DNA sequencing has led to the widespread adoption of DNA sequence-based approaches to a wide variety of biological problems [1,2,3,4]. An increasingly popular technique for identifying the biological variants (organisms or alleles) present in a sample comprised of template DNA from multiple sources (taxa, genomes, or gene copies) is the parallel sequencing of nucleotide fragments generated by PCR (amplicons) [5, 6]; this approach has seen application to problems such as the bulk identification of organisms either in a combined tissue (e.g. [7,8,9]) or environmental samples (e.g. [10, 11]). While the per-nucleotide cost of sequencing has dropped, the cost per run remains substantial, and a single run provides many more sequences than is typically required by such amplicon-based studies [6]. Investigators can maximize cost efficiency by sequencing more than one sample on a single run (multiplex sequencing), but only if sequences can be traced back to their sample of origin.
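As a rough illustration of how reads from a multiplexed run might be traced back to their sample of origin, the sketch below groups reads by an index found at the start of each read. It is not the pipeline used in this study: the index sequences, sample names, fixed index length, and exact-match rule are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical index sequences and sample names, for illustration only.
INDEX_TO_SAMPLE = {
    "ACGTAC": "sample_A",
    "TGCATG": "sample_B",
}


def demultiplex(reads, index_to_sample, index_length=6):
    """Group reads by the index found at the start of each read.

    Returns (assigned, unassigned): `assigned` maps a sample name to the
    list of its reads with the index trimmed off; `unassigned` holds reads
    whose leading bases match no known index exactly.
    """
    assigned = defaultdict(list)
    unassigned = []
    for read in reads:
        index, insert = read[:index_length], read[index_length:]
        sample = index_to_sample.get(index)
        if sample is None:
            # Assumption: no mismatch tolerance; unknown indexes are set aside.
            unassigned.append(read)
        else:
            assigned[sample].append(insert)
    return dict(assigned), unassigned


# Toy reads: a 6-base index followed by the amplified sequence.
reads = [
    "ACGTACGGATTCCA",
    "TGCATGGGATTCCA",
    "ACGTACTTTGGA",
    "NNNNNNGGATTCCA",
]
assigned, unassigned = demultiplex(reads, INDEX_TO_SAMPLE)
```

Exact matching keeps the sketch short; real demultiplexing tools typically also handle sequencing errors within the index, which this example deliberately ignores.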
