Abstract

Background: Random community genomes (metagenomes) are now commonly used to study microbes in different environments. Over the past few years, the major challenge associated with metagenomics has shifted from generating sequences to analyzing them. High-throughput, low-cost next-generation sequencing has given a wide range of researchers access to metagenomics.

Results: A high-throughput pipeline has been constructed to provide high-performance computing to all researchers interested in using metagenomics. The pipeline produces automated functional assignments of sequences in the metagenome by comparing them against both protein and nucleotide databases. Phylogenetic and functional summaries of the metagenomes are generated, and tools for comparative metagenomics are incorporated into the standard views. User access is controlled to ensure data privacy, but the collaborative environment underpinning the service provides a framework for sharing datasets between multiple users. In the metagenomics RAST, all users retain full control of their data, and everything is available for download in a variety of formats.

Conclusion: The open-source metagenomics RAST service provides a new paradigm for the annotation and analysis of metagenomes. With built-in support for multiple data sources and a back end that houses abstract data types, the metagenomics RAST is stable, extensible, and freely available to all researchers. This service has removed one of the primary bottlenecks in metagenome sequence analysis: the availability of high-performance computing for annotating the data.

Highlights

  • Random community genomes are commonly used to study microbes in different environments

  • The metagenomics SEED pipeline was designed so that the parameters for the sequence matches underlying both the phylogenetic and metabolic reconstructions can be altered to restrict matches. It was built using an extensible format that allows new datasets and algorithms to be integrated without recomputing existing results

  • All sequence data remains protected by a password mechanism and is visible only to permitted users. This metagenomics annotation pipeline was developed to handle pyrosequencing data and accommodate some of the nuances associated with that data
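
The idea of altering match parameters to restrict matches can be illustrated with a minimal sketch. It assumes that similarity matches are stored once and filtered when a view is requested. The `Hit` fields, the `restrict_hits` function, and the threshold values are hypothetical and chosen only for illustration; they are not taken from the pipeline itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One stored similarity match between a read and a database entry.

    The field names are illustrative assumptions, not the pipeline's schema.
    """
    read_id: str
    subject_id: str
    evalue: float
    align_len: int


def restrict_hits(hits, max_evalue=1e-5, min_align_len=0):
    """Return only the hits that satisfy the current thresholds.

    Filtering stored hits means a stricter view needs no new database search.
    """
    return [
        h for h in hits
        if h.evalue <= max_evalue and h.align_len >= min_align_len
    ]


# Hypothetical stored matches for two reads.
hits = [
    Hit("read1", "subjA", 1e-30, 80),
    Hit("read1", "subjB", 1e-3, 25),
    Hit("read2", "subjC", 1e-8, 40),
]

default_view = restrict_hits(hits)
strict_view = restrict_hits(hits, max_evalue=1e-10, min_align_len=50)
```

With these made-up values, the default view keeps the matches to `subjA` and `subjC`, and the stricter view keeps only the match to `subjA`. Any reconstruction built on the filtered list would change accordingly.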


Summary

Introduction

Random community genomes (metagenomes) are commonly used to study microbes in different environments. The explosion of metagenomics, in which DNA is sequenced directly from environmental samples, has provided insights into microbial communities. Two approaches to sequencing metagenome samples are commonly used. In the first, DNA is cloned and then sequenced by Sanger sequencing, which generates longer sequence reads but has inherent biases due to the cloning. In the second, DNA is sequenced without cloning, using one of the so-called next-generation sequencing techniques, usually pyrosequencing. Pyrosequencing has much higher throughput and a lower error rate per base sequenced than Sanger sequencing, but its errors are biased toward certain mistakes [3].
