Abstract

Human microbiome reference datasets provide epidemiological context for researchers, enabling them to uncover new insights into their own data through meta-analyses. In addition, large and comprehensive reference sets offer a means to develop or test hypotheses and can pave the way for addressing practical study design considerations such as sample size decisions. We discuss the importance of reference sets in human microbiome research, limitations of existing resources, technical challenges to employing reference sets, examples of their usage, and contributions of the American Gut Project to the development of a comprehensive reference set. Through engaging the general public, the American Gut Project aims to address many of the issues present in existing reference resources, characterizing health and disease, lifestyle, and dietary choices of the participants while extending its efforts globally through international collaborations.

Highlights

  • In the last few years, the study of the bacteria, archaea, microbial eukaryotes, and viruses that inhabit the human body has revealed a remarkable biological and functional diversity [1,2,3,4,5,6].

  • Although the sequence data are readily available for reuse, the distribution of many of the study variables is not approved, limiting the long-term usefulness of the samples. (It should be noted that the Global Gut was not intended to be a reference for microbiome research, but the populations represented in the dataset are extremely difficult to sample and have proven useful in adding perspective for independent projects [29, 39].)

  • Studies employing a reference set typically rely on the closed-reference approach to minimize compute, since only the input study needs to be evaluated and this can be done in an embarrassingly parallel fashion. Another benefit is that the closed-reference strategy is unlikely to yield an operational taxonomic unit (OTU) composed of non-16S sequence, as the reference is expected to contain only 16S exemplars. Comprehensive references like Greengenes typically contain only near-full-length reads, allowing researchers to combine data represented by multiple variable regions.
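
The closed-reference idea in the last highlight can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the pipeline used by the authors or by any particular tool: the two reference sequences, the ungapped identity measure, the 97% threshold, and the function names (`identity`, `assign_read`, `closed_reference_table`) are all hypothetical and chosen only for illustration. Real workflows use dedicated sequence-search software against a curated reference such as Greengenes.

```python
from collections import Counter

# Hypothetical toy reference (ID -> sequence). A real reference would hold
# near-full-length 16S sequences rather than 20-base strings.
REFERENCE = {
    "ref_A": "ACGTACGTACGTACGTACGT",
    "ref_B": "TTGCATTGCATTGCATTGCA",
}


def identity(read, ref):
    """Best fraction of matching bases over all ungapped placements of read on ref."""
    n = len(read)
    if n == 0 or n > len(ref):
        return 0.0
    best = 0
    for start in range(len(ref) - n + 1):
        matches = sum(a == b for a, b in zip(read, ref[start:start + n]))
        best = max(best, matches)
    return best / n


def assign_read(read, reference=REFERENCE, threshold=0.97):
    """Return the ID of the best-matching reference, or None if below threshold.

    Each read is scored against the fixed reference only, never against other
    reads, which is what makes the approach embarrassingly parallel.
    """
    best_id, best_score = None, 0.0
    for ref_id, ref_seq in reference.items():
        score = identity(read, ref_seq)
        if score > best_score:
            best_id, best_score = ref_id, score
    return best_id if best_score >= threshold else None


def closed_reference_table(reads, reference=REFERENCE, threshold=0.97):
    """Count reads per reference ID; reads that match no reference are discarded."""
    assignments = (assign_read(r, reference, threshold) for r in reads)
    return Counter(a for a in assignments if a is not None)
```

Calling `closed_reference_table(["ACGTACGTAC", "GTACGTACGT", "GCATTGCATT", "GGGGGGGGGG"])` assigns the first two reads, which cover different stretches of `ref_A`, to the same reference ID, loosely mirroring how reads from different variable regions can be combined. The last read matches nothing in the reference and is dropped, which is why non-16S sequence is unlikely to form an OTU. Because `assign_read` depends only on one read and the fixed reference, the loop could be distributed across workers without coordination.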



Introduction

In the last few years, the study of the bacteria, archaea, microbial eukaryotes, and viruses that inhabit the human body (e.g., the large intestine) has revealed a remarkable biological and functional diversity [1,2,3,4,5,6]. The decision to sample a few people extensively rather than a large number of people minimally (i.e., a cross-sectional study design) led to observation of only a small fraction of the diversity present within the population [28] and resulted in small sample sizes for different stratifications in the dataset [36], effectively removing the potential to observe demographic or regional differences.
