Abstract

Analysis of shotgun metagenomic data generated from next-generation sequencing platforms can be done through a variety of bioinformatic pipelines. These pipelines employ different sets of sophisticated bioinformatics algorithms, which may affect the results of the analysis. In this study, we compared two commonly used pipelines for shotgun metagenomic analysis, MG-RAST and Kraken 2, in terms of taxonomic classification, diversity analysis, and usability, using primarily their default parameters. Overall, the two pipelines detected similar abundance distributions in the three most abundant taxa: Proteobacteria, Firmicutes, and Bacteroidetes. Within the bacterial domain, 497 genera were identified by both pipelines, while an additional 694 and 98 genera were identified solely by Kraken 2 and MG-RAST, respectively. At the species level, 933 species were detected by both pipelines; Kraken 2 alone detected 3550 species, while MG-RAST uniquely identified 557. For archaea, Kraken 2 detected 105 genera and 236 species, while MG-RAST detected 60 genera and 88 species; 54 genera and 72 species were detected by both methods. Kraken 2 had a quicker analysis time (~4 hours), while MG-RAST took approximately 2 days per sample. This study revealed that Kraken 2 and MG-RAST generate comparable results and that a reliable high-level overview of the sample is generated irrespective of the pipeline selected. However, Kraken 2 generated a more accurate taxonomic identification, given the higher number of “Unclassified” reads in MG-RAST. The observed variations at the genus level show that a main limitation is the use of different databases for classification of the metagenomic data. The results of this research indicate that a more inclusive and representative classification of microbiomes may be achieved by creating combined pipelines.
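The shared and pipeline-specific counts reported above amount to a set comparison between the taxa each pipeline detects. The following is a minimal sketch of that comparison, not the authors' actual workflow: the function names and the toy genus lists are hypothetical and serve only as illustration. The last part derives per-pipeline bacterial totals from the counts given in the abstract.

```python
def compare_taxa(taxa_a, taxa_b):
    """Return (shared, only_a, only_b) for two collections of taxon names."""
    a, b = set(taxa_a), set(taxa_b)
    return a & b, a - b, b - a


def totals_from_overlap(shared, only_a, only_b):
    """Total number of taxa detected by each pipeline, given overlap counts."""
    return shared + only_a, shared + only_b


def jaccard_from_counts(shared, only_a, only_b):
    """Jaccard similarity: shared taxa divided by all taxa seen by either pipeline."""
    return shared / (shared + only_a + only_b)


# Toy input: hypothetical genus lists, NOT data from the study.
kraken_genera = ["Bacteroides", "Clostridium", "Escherichia", "Prevotella"]
mgrast_genera = ["Bacteroides", "Clostridium", "Ruminococcus"]
shared, kraken_only, mgrast_only = compare_taxa(kraken_genera, mgrast_genera)
print(sorted(shared), sorted(kraken_only), sorted(mgrast_only))

# Bacterial genus counts as reported in the abstract:
# 497 shared, 694 Kraken 2 only, 98 MG-RAST only.
kraken_total, mgrast_total = totals_from_overlap(497, 694, 98)
print(kraken_total, mgrast_total)  # 1191 595

# Bacterial species counts as reported in the abstract:
# 933 shared, 3550 Kraken 2 only, 557 MG-RAST only.
kraken_species, mgrast_species = totals_from_overlap(933, 3550, 557)
print(kraken_species, mgrast_species)  # 4483 1490

genus_jaccard = jaccard_from_counts(497, 694, 98)
print(round(genus_jaccard, 3))
```

Under these counts, Kraken 2 reported 1191 bacterial genera in total and MG-RAST 595. The Jaccard index is one possible way to summarize the overlap; the source does not state that the authors used it.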

Highlights

  • Metagenomics is a high-throughput sequencing (HTS) technique commonly used to investigate complex microbial communities in terms of composition, structure, diversity, and function

  • Bacterial profiling revealed a comparable taxon distribution among the four most abundant phyla classified by both pipelines (Tables 2 and 3): Proteobacteria, Firmicutes, Bacteroidetes, and Actinobacteria, which together accounted for about 80% of the total microbial population

  • We carried out a comparative metagenomic assessment of cattle fecal microbial composition using both the Kraken 2 and MG-RAST pipelines

Summary

Introduction

Metagenomics is a high-throughput sequencing (HTS) technique commonly used to investigate complex microbial communities in terms of composition, structure, diversity, and function. This culture-independent approach has gained importance in microbiological studies over the past decade [1], especially in studies of environmental communities [2, 3], in industrial quality control processes [4], and in understanding the influence of gastrointestinal microbes on human health and well-being [5]. Shotgun metagenomics, on the other hand, relies on extraction and sequencing of the complete DNA to study the genomic content of a sample. This integrated strategy provides a rich picture of the microbiota and offers the chance to study the taxonomic classification

