Abstract

Background

Microbial life dominates the earth, but many species are difficult or even impossible to study under laboratory conditions. Sequencing DNA directly from the environment, a technique commonly referred to as metagenomics, is an important tool for cataloging microbial life. This culture-independent approach involves collecting samples that contain microbes, extracting DNA from the samples, and sequencing the DNA. A sample may contain many different microorganisms, macroorganisms, and even free-floating environmental DNA. A fundamental challenge in metagenomics has been estimating the abundance of organisms in a sample based on the frequency with which each organism's DNA is observed in reads generated via DNA sequencing.

Methodology/Principal Findings

We created mixtures of ten microbial species for which genome sequences are known. Each mixture contained an equal number of cells of each species. We then extracted DNA from the mixtures, sequenced the DNA, and measured the frequency with which genomic regions from each organism were observed in the sequenced DNA. We found that the observed frequency of reads mapping to each organism did not reflect the equal numbers of cells that were known to be included in each mixture. The relative organism abundances varied significantly depending on the DNA extraction and sequencing protocol used.

Conclusions/Significance

We describe a new data resource for measuring the accuracy of metagenomic binning methods, created by in vitro simulation of a metagenomic community. Our in vitro simulation can be used to complement previous in silico benchmark studies. In constructing a synthetic community and sequencing its metagenome, we encountered several sources of observation bias that likely affect most metagenomic experiments to date and present challenges for comparative metagenomic studies. DNA preparation methods have a particularly profound effect in our study, implying that samples prepared with different protocols are not suitable for comparative metagenomics.

Highlights

  • The vast majority of life on earth is microbial, and efforts to study many of these organisms via laboratory culture have met with limited success, leading to use of the term “the uncultured majority” when describing microbial life on earth [1]

  • We describe an in vitro metagenomic simulation intended to inform and complement the in silico simulations used by others for benchmarking

  • Given that a known quantity of each organism was mixed in the metagenomic simulation, we investigated whether estimates of organism relative abundance based on sequencing read counts would match the predicted abundance given the way in which our sample was created

Introduction

The vast majority of life on earth is microbial, and efforts to study many of these organisms via laboratory culture have met with limited success, leading to use of the term “the uncultured majority” when describing microbial life on earth [1]. Metagenomics holds promise as a means to access the uncultured majority [2,3], and can be broadly defined as the study of microbial communities using high-throughput DNA sequencing technology without requirement for laboratory culture [4,5,6,7]. The procedure is commonly referred to as shotgun metagenomics or environmental shotgun sequencing. A fundamental challenge in metagenomics has been estimating the abundance of organisms in a sample based on the frequency with which each organism's DNA is observed in reads generated via DNA sequencing.
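To make the abundance-estimation problem concrete, the following minimal Python sketch converts per-organism read counts into estimated relative cell abundances and compares them with the equal abundance expected from an even mixture. It is an illustration under stated assumptions, not the procedure used in the study: it assumes reads are sampled uniformly along each genome and that each cell carries one genome copy, so read counts are divided by genome length before normalizing. The organism names, read counts, genome lengths, and function names are all hypothetical.

```python
from typing import Dict


def relative_abundance(
    read_counts: Dict[str, int], genome_lengths: Dict[str, int]
) -> Dict[str, float]:
    """Estimate relative cell abundance from mapped read counts.

    Assumption (not from the source): reads are drawn uniformly from each
    genome and every cell holds one genome copy, so reads per base pair
    is proportional to cell count.
    """
    # Reads per base pair removes the advantage of large genomes,
    # which yield more reads per cell than small ones.
    per_base = {
        name: read_counts[name] / genome_lengths[name] for name in read_counts
    }
    total = sum(per_base.values())
    return {name: value / total for name, value in per_base.items()}


def fold_deviation(observed: Dict[str, float]) -> Dict[str, float]:
    """Ratio of observed abundance to the equal share expected in an even mix."""
    expected = 1.0 / len(observed)
    return {name: value / expected for name, value in observed.items()}


# Hypothetical three-organism mixture; the values are invented for illustration.
read_counts = {"organism_a": 2000, "organism_b": 3000, "organism_c": 500}
genome_lengths = {
    "organism_a": 2_000_000,
    "organism_b": 6_000_000,
    "organism_c": 1_000_000,
}

abundance = relative_abundance(read_counts, genome_lengths)
fold = fold_deviation(abundance)

for name in abundance:
    print(f"{name}: abundance={abundance[name]:.3f}, fold vs. even={fold[name]:.2f}")
```

In this invented example, organism_b contributes the most reads yet is not the most abundant once genome length is taken into account. A fold value far from 1.0 for a mixture known to be even would point to the kind of extraction or sequencing bias the abstract describes.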
