Database limitations for studying the human gut microbiome.

Camila K Dias, Victor S Pylro, Robert Starke, Daniel K Morais

doi:10.7717/peerj-cs.289

Abstract

Background: In the last twenty years, new methodologies have made it possible to gather large amounts of data on the genetic information and metabolic functions associated with the human gut microbiome. Processing all of this data is not a simple task, however, and the result can be an excess of information awaiting proper annotation. This assessment evaluated how well widely respected databases could describe a mock human gut microbiome.

Methods: In this work, we critically evaluate the output of cross-referencing the UniProt Knowledgebase (UniProtKB) with either the Kyoto Encyclopedia of Genes and Genomes Orthologs (KEGG Orthologs) or the evolutionary genealogy of genes: Non-supervised Orthologous Groups (EggNOG) database, for a list of species previously found in the human gut microbiome.

Results: From a list comprising 131 species and 52 genera, 53 species and 40 genera had corresponding entries in the KEGG database, and 82 species and 47 genera had corresponding entries in the EggNOG database. We also present the KEGG Orthologs (KOs) and EggNOG Orthologs (NOGs) entries returned by the search, their distribution over species and genera, and lists of functions that appeared in many species or genera: the “core” functions of the human gut microbiome. We further present the relative abundance of KOs and NOGs across phyla and genera. Lastly, we report variation between searches run with different arguments on the database entries.

Inferring functionality by cross-referencing UniProt with KEGG or EggNOG can be unreliable, both because few species are annotated in UniProt and because few functions are affiliated with most of those species. In this cross-search against UniProt for a mock human gut microbiome, the EggNOG database performed better. Nevertheless, efforts targeting cultivation, single-cell sequencing, or the reconstruction of high-quality metagenome-assembled genomes (MAGs) and their annotation are needed before these databases can be used to infer functionality in human gut microbiome studies.
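The two quantities reported in the abstract, namely how many listed species have any database entry and which functions recur across species, can be illustrated with a minimal sketch. This is not the authors' pipeline: the species-to-ortholog records, the placeholder IDs such as `KO_A`, and the helper names `covered_species` and `core_functions` are all hypothetical, and the threshold for calling a function "core" is an assumption chosen for illustration.

```python
from collections import defaultdict

# Hypothetical toy records of (species, ortholog ID) pairs.
# The IDs are placeholders, not real KEGG or EggNOG identifiers,
# and the pairings are not data from the study.
records = [
    ("Bacteroides fragilis", "KO_A"),
    ("Bacteroides fragilis", "KO_B"),
    ("Escherichia coli", "KO_A"),
    ("Escherichia coli", "KO_C"),
    ("Faecalibacterium prausnitzii", "KO_A"),
]

# Hypothetical target list standing in for the mock gut microbiome.
target_species = [
    "Bacteroides fragilis",
    "Escherichia coli",
    "Faecalibacterium prausnitzii",
    "Roseburia intestinalis",
]


def covered_species(targets, records):
    """Return the target species that have at least one annotated entry."""
    annotated = {species for species, _ in records}
    return [species for species in targets if species in annotated]


def core_functions(records, min_fraction=1.0):
    """Return ortholog IDs found in at least min_fraction of annotated species."""
    species_by_function = defaultdict(set)
    for species, function in records:
        species_by_function[function].add(species)
    n_species = len({species for species, _ in records})
    return sorted(
        function
        for function, species in species_by_function.items()
        if len(species) / n_species >= min_fraction
    )


covered = covered_species(target_species, records)
core = core_functions(records)
print(f"{len(covered)} of {len(target_species)} species have entries")
print("core functions:", core)
```

Lowering `min_fraction` relaxes the definition of "core" from functions present in every annotated species to functions present in many of them, which is closer to the wording used in the abstract.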

Highlights

High-throughput sequencing (HTS) of DNA allows for comparative analyses of diversity, abundance, and important functional genes and their traits, without the need to cultivate individual microbes and at far greater depths than ever before (Weinstock, 2012).
When comparing our list of genera with newer studies (David et al., 2014; Zhernakova et al., 2016; Chauhan et al., 2018; Johnson, 2020), representing 6 years of technological evolution, we found that only 19 new genera were included (Table S2). This result suggests that adding more studies would not change our outcomes and that database update efforts should be directed at unknown taxa, since the newer studies reported many genera that were already in our list and were missing from the current databases.
After filtering out the UniProt entries not belonging to Bacteria or Archaea and removing those without functional annotation, we obtained 6,531,071 entries for KEGG Orthologs (KOs) and 4,749,622 entries for evolutionary genealogy of genes: Non-supervised Orthologous Groups (EggNOG) IDs.
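The filtering step described in the last highlight can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the record layout (`accession`, `domain`, `ko`, `eggnog` fields), the accession strings, and the function name `filter_entries` are hypothetical, and a real UniProt export would need its own parsing.

```python
# Hypothetical toy entries; field names and values are illustrative only.
entries = [
    {"accession": "ENTRY_1", "domain": "Bacteria", "ko": ["KO_A"], "eggnog": ["NOG_X"]},
    {"accession": "ENTRY_2", "domain": "Archaea", "ko": [], "eggnog": ["NOG_Y"]},
    {"accession": "ENTRY_3", "domain": "Eukaryota", "ko": ["KO_B"], "eggnog": ["NOG_Z"]},
    {"accession": "ENTRY_4", "domain": "Bacteria", "ko": [], "eggnog": []},
]

PROKARYOTE_DOMAINS = {"Bacteria", "Archaea"}


def filter_entries(entries, annotation_field):
    """Keep entries from Bacteria or Archaea that carry the given annotation."""
    return [
        entry
        for entry in entries
        if entry["domain"] in PROKARYOTE_DOMAINS and entry[annotation_field]
    ]


ko_entries = filter_entries(entries, "ko")
eggnog_entries = filter_entries(entries, "eggnog")
print(len(ko_entries), "entries with KO annotation")
print(len(eggnog_entries), "entries with EggNOG annotation")
```

Filtering once per annotation field mirrors the way the highlight reports separate entry counts for KOs and for EggNOG IDs.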

Summary

Introduction

High-throughput sequencing (HTS) of DNA allows for comparative analyses of diversity, abundance, and important functional genes and their traits, without the need to cultivate individual microbes and at far greater depths than ever before (Weinstock, 2012). Besides the issues with the methodology, there are issues with database updates given the ever-growing amount of data (Song, Lee & Nam, 2018). Another issue that has to be dealt with when researching a given microbiota is the choice of binning approach, a group of methods used to cluster contigs into what would be representative of a single population genome. Processing all of the available data is not a simple task, and the result can be an excess of information awaiting proper annotation. This assessment evaluated how well widely respected databases could describe a mock human gut microbiome. Nevertheless, efforts targeting cultivation, single-cell sequencing, or the reconstruction of high-quality metagenome-assembled genomes (MAGs) and their annotation are needed before these databases can be used to infer functionality in human gut microbiome studies.

Journal: PeerJ Computer Science	Publication Date: Aug 17, 2020
Citations: 12	License type: CC BY 4.0
