Abstract

Shotgun metagenomic sequencing is a powerful tool for characterizing complex biological matrices: it enables the analysis of prokaryotic and eukaryotic organisms and of viruses in a single experiment, with the possibility of reconstructing de novo the whole metagenome or a set of genes of interest. One of the main factors limiting the use of shotgun metagenomics in wide-scale projects is its high cost. We set out to determine whether shallow shotgun metagenomics can be used to characterize complex biological matrices at reduced cost. We simulated a decrease in sequencing depth by randomly subsampling reads and measured the variation of several summary statistics. The main statistics compared were alpha diversity estimates, species abundance, and the ability to reconstruct the metagenome de novo, in terms of length and completeness. Our results show that diversity indices of complex prokaryotic, eukaryotic and viral communities can be accurately estimated with 500,000 reads or fewer, although particularly complex samples may require 1,000,000 reads. In contrast, any task involving the reconstruction of the metagenome performed poorly, even with the largest simulated subsample (1,000,000 reads). The reconstructed assembly was shorter than the one obtained with the full dataset, and the proportion of conserved genes identified in the metagenome was drastically lower than in the full sample. Shallow shotgun metagenomics can be a useful tool to describe the structure of complex matrices, but it is not adequate to reconstruct the metagenome, even partially.
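The subsampling idea described above can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes reads have already been assigned to taxa, the taxon labels and counts are hypothetical toy values, and the Shannon index stands in as one example of an alpha diversity estimate.

```python
import math
import random
from collections import Counter


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values() if c > 0)


def subsample_reads(read_labels, depth, seed=0):
    """Simulate a shallower run by drawing `depth` reads without replacement."""
    rng = random.Random(seed)
    return rng.sample(read_labels, depth)


# Hypothetical toy dataset: one taxon label per classified read.
full_counts = {"taxon_A": 9000, "taxon_B": 900, "taxon_C": 90, "taxon_D": 10}
reads = [taxon for taxon, n in full_counts.items() for _ in range(n)]

full_h = shannon_index(Counter(reads))
for depth in (100, 1000, 5000):
    sub = Counter(subsample_reads(reads, depth, seed=42))
    # Report depth, observed richness, subsample diversity and full-data diversity.
    print(depth, len(sub), round(shannon_index(sub), 3), round(full_h, 3))
```

Comparing the subsample value against the full-data value at each depth is the kind of check the abstract summarizes; rare taxa are the first to drop out of the observed richness as depth decreases.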

Highlights

  • The community includes a total of 20 bacterial species, of which 5 have a frequency of 0.02%, 5 a frequency of 0.18%, 5 a frequency of 1.8% and 5 a frequency of 18%
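As a back-of-the-envelope illustration of what these frequencies imply at shallow depth, the expected number of reads per species can be computed for each abundance tier. This sketch is not part of the original analysis: it assumes read counts are exactly proportional to the stated frequencies (ignoring genome size and other biases), and the function and variable names are hypothetical.

```python
# Per-species frequencies (in percent) of the four abundance tiers.
TIERS_PERCENT = [0.02, 0.18, 1.8, 18.0]
SPECIES_PER_TIER = 5

# Sanity check: the 20 species together account for the whole community.
total_percent = sum(f * SPECIES_PER_TIER for f in TIERS_PERCENT)


def expected_reads(depth, freq_percent):
    """Expected reads for one species, assuming counts proportional to frequency."""
    return depth * freq_percent / 100.0


for depth in (500_000, 1_000_000):
    for freq in TIERS_PERCENT:
        print(f"depth={depth:>9,}  freq={freq:>5}%  expected reads/species={expected_reads(depth, freq):,.0f}")
```

Under this proportionality assumption, a species at 0.02% would be expected to contribute on the order of 100 reads at a depth of 500,000 reads.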

Introduction

Shotgun metagenomics offers the possibility to assess the complete taxonomic composition of biological matrices and to estimate the relative abundances of each species in an unbiased way[1,2]. It allows the agnostic characterization of complex communities containing eukaryotes, fungi, bacteria and viruses. Metagenome shotgun high-throughput sequencing has progressively gained popularity in parallel with the advance of next-generation sequencing (NGS) technologies[3,4], which provide more data in less time and at a lower cost than previous sequencing techniques. This has enabled its extensive application to a wide variety of biological mixtures, such as environmental samples[5,6], gut samples[7,8,9], skin samples[10], clinical samples for diagnostics and surveillance purposes[11,12,13,14] and food ecosystems[15,16]. Several studies have suggested that whole shotgun metagenome sequencing is more effective in the characterization of metagenomic samples than targeted amplicon approaches, with the additional capability of providing functional information regarding the studied samples[24,25].
