Accurate Genome Relative Abundance Estimation Based on Shotgun Metagenomic Reads

Li C Xia, Fengzhu Sun, Jed A Fuhrman, Jacob A Cram, Ting Chen

doi:10.1371/journal.pone.0027992


Open Access

https://doi.org/10.1371/journal.pone.0027992


Journal: PLoS ONE
Publication Date: Dec 6, 2011
Citations: 143
License type: CC BY 4.0

Affiliation: University of Southern California, Tsinghua University

Abstract

Accurate estimation of microbial community composition based on metagenomic sequencing data is fundamental for subsequent metagenomics analysis. Prevalent estimation methods are mainly based on directly summarizing alignment results or variants thereof, and they often result in biased and/or unstable estimates. We have developed a unified probabilistic framework (named GRAMMy) by explicitly modeling read assignment ambiguities, genome size biases and read distributions along the genomes. A maximum likelihood method is employed to compute the Genome Relative Abundance of microbial communities using the Mixture Model theory (GRAMMy). GRAMMy has been demonstrated to give estimates that are accurate and robust across both simulated and real read benchmark datasets. We applied GRAMMy to a collection of 34 metagenomic read sets from four metagenomics projects and identified 99 frequent species (minimally 0.5% abundant in at least 50% of the datasets) in the human gut samples. Our results show substantial improvements over previous studies, such as adjusting the over-estimated abundance for Bacteroides species for human gut samples, by providing a new reference-based strategy for metagenomic sample comparisons. GRAMMy can be used flexibly with many read assignment tools (mapping, alignment or composition-based), even with low-sensitivity mapping results from huge short-read datasets. It will be increasingly useful as an accurate and robust tool for abundance estimation with the growing size of read sets and the expanding database of reference genomes.

Highlights

Microbial organisms are ubiquitous dwellers of the earth’s biosphere whose activities shape the earth’s biogeochemistry
The GRAMMy framework is based on a mixture model for the short metagenomic reads and an Expectation Maximization (EM) algorithm, as outlined in the model schema and the analysis flowchart in Figures 1 and 2
We have developed the GRAMMy framework for estimating genome relative abundance with shotgun metagenomic reads
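
The mixture-model and EM idea named in the highlights can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `read_probs` input (an assumed matrix of per-read probabilities given each reference genome, for example derived from alignment scores), and the length-normalization step are illustrative assumptions chosen to show how ambiguous read assignments and genome size bias might be handled together.

```python
import numpy as np

def em_mixing_proportions(read_probs, n_iter=200, tol=1e-10):
    """Estimate the fraction of reads drawn from each genome by EM.

    read_probs[i, j] is the (assumed, precomputed) probability of read i
    given genome j. Every read must have a nonzero probability for at
    least one genome, otherwise the E-step would divide by zero.
    """
    read_probs = np.asarray(read_probs, dtype=float)
    n_genomes = read_probs.shape[1]
    pi = np.full(n_genomes, 1.0 / n_genomes)  # start from uniform proportions
    for _ in range(n_iter):
        # E-step: posterior probability that each read came from each genome
        weighted = read_probs * pi
        resp = weighted / weighted.sum(axis=1, keepdims=True)
        # M-step: new proportions are the average responsibilities
        new_pi = resp.mean(axis=0)
        converged = np.abs(new_pi - pi).max() < tol
        pi = new_pi
        if converged:
            break
    return pi

def genome_relative_abundance(pi, genome_lengths):
    """Correct read proportions for genome size, then renormalize."""
    per_base = np.asarray(pi, dtype=float) / np.asarray(genome_lengths, dtype=float)
    return per_base / per_base.sum()

# Toy input (hypothetical): two reads unique to genome A, one unique to
# genome B, and one read that matches both genomes equally well.
toy_read_probs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
toy_lengths = [2_000_000, 1_000_000]  # genome A is twice as long as genome B

toy_pi = em_mixing_proportions(toy_read_probs)
toy_abundance = genome_relative_abundance(toy_pi, toy_lengths)
print(toy_pi, toy_abundance)
```

On this toy input the read proportions settle near 2/3 and 1/3, while the length-corrected abundances come out equal: genome A contributes twice the reads only because it is twice as long. Directly counting assigned reads would miss that correction, which is the kind of bias the abstract describes.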

Summary

Introduction

Microbial organisms are ubiquitous dwellers of the earth’s biosphere whose activities shape the earth’s biogeochemistry. The knowledge of their presence and abundance in nature is of great relevance to ecology as well as to human well-being. To study microbes in natural environments, researchers frequently apply whole genome shotgun sequencing to uncultured samples to generate genomic sequence reads reflecting the structure of microbial communities [2,3]. As a consequence of the random sampling and sequencing scheme of the shotgun metagenomics approach, the presence and abundance information of metagenomes is preserved in raw reads, although some studies have shown that biases in sampling can occur, as is true for virtually all approaches [4]. The subsequent analysis of metagenomic data remains a challenging computational problem because of the mixed nature of metagenomes and the fact that we only sequence a small fraction of them.



