Abstract

Background: Metagenomics caused a quantum leap in microbial ecology. However, the inherent size and complexity of metagenomic data limit its interpretation. Quantifying metagenomic traits within metagenomic analysis workflows has the potential to improve the exploitation of metagenomic data. Metagenomic traits are organisms’ characteristics linked to their performance. They are measured at the genomic level by taking a random sample of individuals in a community. As such, these traits provide valuable information to uncover microorganisms’ ecological patterns. The Average Genome Size (AGS) and the 16S rRNA gene Average Copy Number (ACN) are two highly informative metagenomic traits that reflect microorganisms’ ecological strategies as well as the environmental conditions they inhabit.

Results: Here, we present the ags.sh and acn.sh tools, which analytically derive the AGS and ACN metagenomic traits. These tools represent an advance on previous approaches to compute the AGS and ACN traits. Benchmarking shows that ags.sh is up to 11 times faster than state-of-the-art tools dedicated to the estimation of AGS. Both ags.sh and acn.sh show comparable or higher accuracy than existing tools used to estimate these traits. To exemplify the applicability of both tools, we analyzed the 139 prokaryotic metagenomes of TARA Oceans and revealed the ecological strategies associated with different water layers.

Conclusion: We took advantage of recent advances in gene annotation to develop the ags.sh and acn.sh tools, which combine easy usage with fast and accurate performance. Our tools compute the AGS and ACN metagenomic traits on unassembled metagenomes and allow researchers to improve their metagenomic data analysis and gain deeper insights into microorganisms’ ecology. The ags.sh and acn.sh tools are publicly available using Docker container technology at https://github.com/pereiramemo/AGS-and-ACN-tools.

Highlights

  • Metagenomics caused a quantum leap in microbial ecology

  • The workflow of ags.sh consists of the following steps: 1) Short-read sequences are filtered by length and trimmed with BBDuk [29]; 2) Open Reading Frames (ORFs) are predicted in the short-read sequences with FragGeneScan-Plus [30, 31]; 3) Single-copy genes are annotated with UProC [32]; 4) The gene coverage is estimated as the total number of annotated base pairs divided by the gene length; 5) The number of genomes (NGs) is computed as the mean coverage of the 35 single-copy genes; 6) The Average Genome Size (AGS) is computed as the ratio of the total number of base pairs to the NGs

  • The computation of the 16S rRNA gene average copy number follows a similar methodology: we estimate the coverage of the 16S rRNA genes and divide it by the NGs
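
The arithmetic in steps 4 to 6 and in the ACN computation can be illustrated with a minimal Python sketch. This is not the ags.sh or acn.sh implementation: the function names, the three placeholder gene names, and all numbers are hypothetical and chosen only so the arithmetic is easy to follow (the tools use 35 single-copy genes). It assumes that the annotated base pairs per gene and the reference gene lengths are already available from the annotation step.

```python
from statistics import mean


def gene_coverage(annotated_bp: float, gene_length: float) -> float:
    """Step 4: annotated base pairs divided by the gene length."""
    return annotated_bp / gene_length


def number_of_genomes(annotated_bp: dict, gene_length: dict) -> float:
    """Step 5: mean coverage across the single-copy genes."""
    return mean(
        gene_coverage(annotated_bp[gene], gene_length[gene])
        for gene in annotated_bp
    )


def average_genome_size(total_bp: float, ngs: float) -> float:
    """Step 6: total base pairs in the sample divided by the NGs."""
    return total_bp / ngs


def average_copy_number(rrna_bp: float, rrna_length: float, ngs: float) -> float:
    """ACN: 16S rRNA gene coverage divided by the NGs."""
    return gene_coverage(rrna_bp, rrna_length) / ngs


# Hypothetical toy input: three placeholder genes instead of 35.
toy_annotated_bp = {"geneA": 2000, "geneB": 6000, "geneC": 7200}
toy_gene_length = {"geneA": 1000, "geneB": 1500, "geneC": 1200}

ngs = number_of_genomes(toy_annotated_bp, toy_gene_length)  # coverages 2, 4, 6
ags = average_genome_size(12_000_000, ngs)                  # hypothetical total bp
acn = average_copy_number(12_000, 1500, ngs)                # hypothetical 16S values

print(f"NGs = {ngs}, AGS = {ags} bp, ACN = {acn}")
```

With these toy values the per-gene coverages are 2, 4, and 6, so the NGs is 4, the AGS is 3,000,000 bp, and the ACN is 2.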

Summary

Introduction

Metagenomics caused a quantum leap in microbial ecology. However, the inherent size and complexity of metagenomic data limit its interpretation. Metagenomic traits are organisms’ characteristics linked to their performance. They are measured at the genomic level by taking a random sample of individuals in a community. As such, these traits provide valuable information to uncover microorganisms’ ecological patterns. Community functional traits measured at the genome level in a random sample of individuals (i.e., metagenomic traits) can help to uncover ecological patterns in short-read metagenomic data [5]. Functional traits are defined as characteristics of an organism that are linked to its performance and influence its ecology and evolution [6]. Previous studies have used metagenomic traits to explain different aspects of microbial ecology, including why microorganisms live
