Abstract

High-throughput parallel sequencing is a powerful tool for the quantification of microbial diversity through the amplification of nuclear ribosomal gene regions. Recent work has extended this approach to the quantification of diversity within otherwise difficult-to-study metazoan groups. However, nuclear ribosomal genes present both analytical challenges and practical limitations that follow from their mutational properties. Here we exploit useful properties of protein-coding genes for cross-species amplification and denoising of 454 flowgrams. We first use experimental mixtures of species from the class Collembola to amplify and pyrosequence the 5′ region of the COI barcode, and we implement a new algorithm called PyroClean for the denoising of Roche GS FLX pyrosequences. Using parameter values from the analysis of experimental mixtures, we then analyse two communities sampled from field sites on the island of Tenerife. Cross-species amplification success of target mitochondrial sequences in experimental species mixtures is high; however, there is little relationship between template DNA concentrations and pyrosequencing read abundance. Homopolymer error correction and filtering against a consensus reference sequence reduced the volume of unique sequences to approximately 5% of the original unique raw reads. Filtering of remaining non-target sequences attributed to PCR error, sequencing error, or numts further reduced unique sequence volume to 0.8% of the original raw reads. PyroClean reduces or eliminates the need for an additional, time-consuming step to cluster reads into Operational Taxonomic Units, which facilitates the detection of intraspecific DNA sequence variation. PyroCleaned sequence data from field sites in Tenerife demonstrate the utility of our approach for quantifying evolutionary diversity and its spatial structure. Comparison of our sequence data to public databases reveals that we recover both interspecific and intraspecific sequence diversity.
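To make the "unique sequences as a fraction of unique raw reads" figures concrete, the following is a minimal sketch and not the PyroClean algorithm itself. It assumes a crude proxy for homopolymer error correction: reads that differ only in the length of a homopolymer run are merged under their most abundant variant. The function names and the toy reads are hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict
from itertools import groupby


def homopolymer_key(seq: str) -> str:
    """Collapse every run of identical bases to one base (AAAT -> AT)."""
    return "".join(base for base, _run in groupby(seq.upper()))


def merge_homopolymer_variants(reads):
    """Group reads that differ only in homopolymer run length.

    Returns a Counter mapping each group's most abundant read to the
    total abundance of that group. This is an illustrative proxy, not
    the consensus-reference filtering described in the abstract.
    """
    groups = defaultdict(Counter)
    for read in reads:
        groups[homopolymer_key(read)][read.upper()] += 1
    merged = Counter()
    for variants in groups.values():
        representative, _count = variants.most_common(1)[0]
        merged[representative] = sum(variants.values())
    return merged


def unique_fraction(before, after) -> float:
    """Unique sequences after filtering as a fraction of unique raw reads."""
    return len(set(after)) / len(set(before))


# Toy reads, made up for illustration (not data from the study).
raw_reads = [
    "ACGGTTA", "ACGGTTA", "ACGGTTA",  # most abundant variant
    "ACGGGTTA",                        # one extra G in a homopolymer run
    "ACGGTA",                          # one missing T in a homopolymer run
    "ACCTTA", "ACCTTA",                # a second, distinct sequence
]
cleaned = merge_homopolymer_variants(raw_reads)
print(cleaned)
print(unique_fraction(raw_reads, cleaned))
```

In this toy input, four unique raw sequences reduce to two. Note that the proxy would also merge genuine variants that differ only in homopolymer length; in a protein-coding region a single-base length change shifts the reading frame, which is one reason such differences are plausible candidates for sequencing error.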


Introduction

The difficulty of quantifying microbial and meiofaunal community diversity has led to the use of high-throughput parallel (HTP) sequencing as a direct approach to the problem (e.g. [1,2,3,4,5,6,7]). The gene of choice for amplicon HTP sequencing of bacteria has been the 16S small-subunit ribosomal gene, because it is ubiquitous in microbes and contains conserved sequence motifs that facilitate cross-species amplification [1,2,3,4]. Weighing the amplicon criteria that matter for diversity analyses using HTP sequencing against the relative merits of mitochondrial DNA (mtDNA) genes, protein-coding genes, and the Cytochrome Oxidase subunit 1 (COI) gene suggests that these markers have distinct advantages over nuclear ribosomal RNA (rRNA) genes. The evolutionary properties of protein-coding genes within the mitochondrial genome provide the potential for universal cross-taxon amplification [16], and several recent studies demonstrate the potential for mtDNA COI primers to capture sample diversity for pyrosequencing [15,17,18]. In addition to these criteria, it is desirable to maximise the capture of taxonomic diversity within a focal group, and the faster evolutionary substitution rate of the mtDNA genome relative to nuclear rRNA genes provides greatly enhanced taxonomic resolution.

