Abstract

Background: Microbiome studies have uncovered associations between microbes and human, animal, and plant health outcomes. This has led to an interest in developing microbial interventions for treatment of disease and optimization of crop yields, which requires identification of microbiome features that impact the outcome in the population of interest. That task is challenging because of the high dimensionality of microbiome data and the confounding that results from the complex and dynamic interactions among host, environment, and microbiome. In the presence of such confounding, variable selection and estimation procedures may have unsatisfactory performance in identifying microbial features with an effect on the outcome.

Results: In this manuscript, we aim to estimate population-level effects of individual microbiome features while controlling for confounding by a categorical variable. Due to the high dimensionality and confounding-induced correlation between features, we propose feature screening, selection, and estimation conditional on each stratum of the confounder followed by a standardization approach to estimation of population-level effects of individual features. Comprehensive simulation studies demonstrate the advantages of our approach in recovering relevant features. Utilizing a potential-outcomes framework, we outline assumptions required to ascribe causal, rather than associational, interpretations to the identified microbiome effects. We conducted an agricultural study of the rhizosphere microbiome of sorghum in which nitrogen fertilizer application is a confounding variable. In this study, the proposed approach identified microbial taxa that are consistent with biological understanding of potential plant-microbe interactions.

Conclusions: Standardization enables more accurate identification of individual microbiome features with an effect on the outcome of interest compared to other variable selection and estimation procedures when there is confounding by a categorical variable.
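To make the standardization idea in the abstract concrete, the sketch below fits a separate linear model within each stratum of a categorical confounder and then averages the stratum-specific slopes, weighted by the empirical stratum proportions. It is an illustration under stated assumptions, not the authors' implementation: ordinary least squares stands in for the screening and penalized estimation steps, the outcome is assumed linear in the features within each stratum, and the function name `standardized_effects` is hypothetical.

```python
import numpy as np


def standardized_effects(X, y, strata):
    """Average stratum-specific linear effects over the confounder distribution.

    X: (n, p) feature matrix; y: (n,) outcome; strata: (n,) confounder labels.
    Returns a length-p vector equal to the sum over strata s of
    P(S = s) * beta_s, where beta_s holds the slopes fitted in stratum s.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    strata = np.asarray(strata)
    effects = np.zeros(X.shape[1])
    for s in np.unique(strata):
        mask = strata == s
        # Design matrix with an intercept, using only samples in this stratum.
        design = np.column_stack([np.ones(mask.sum()), X[mask]])
        # Assumption: plain least squares replaces the penalized,
        # post-screening fit described in the abstract.
        coef, *_ = np.linalg.lstsq(design, y[mask], rcond=None)
        # Weight the stratum-specific slopes by the stratum proportion.
        effects += mask.mean() * coef[1:]
    return effects
```

With two equally sized strata in which a feature has slopes 1 and 3, the standardized effect of that feature is their weighted average, 2. In a high-dimensional setting each stratum would need more samples than retained features for the least-squares step to be well defined, which is one reason a screening step precedes estimation.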

Highlights

  • Advancements in next-generation sequencing (NGS) technologies have recently allowed for unprecedented examination of the community of microorganisms in a host or site of interest, referred to as a microbiome [29]

  • We either model each microbiome feature effect as common across all confounder strata or allow for effect modification through stratum-specific microbiome feature effects denoted with the suffix “EffMod.” For each of the six models under comparison, we investigate the performance of variable selection using the least absolute shrinkage and selection operator (LASSO) and smoothly clipped absolute deviation (SCAD) penalties for p both with and without screening, as well as the proposed inference procedure using the debiased LASSO with iterative sure independence screening (SIS)

  • Simulation performance was summarized across all 100 simulated data sets for each scenario, model, and variable selection method considered using the true positive rate (TPR) and false positive rate (FPR)
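The two summary metrics named in the last highlight can be computed per simulated data set as in the sketch below. The function name `selection_rates` and the representation of features as integer indices are assumptions made for illustration; the source does not say how the rates were aggregated across the 100 data sets.

```python
def selection_rates(selected, truth, n_features):
    """Return (TPR, FPR) for one variable-selection result.

    selected: indices of features chosen by the method.
    truth: indices of features that truly affect the outcome.
    n_features: total number of candidate features.
    """
    selected, truth = set(selected), set(truth)
    n_true = len(truth)
    n_null = n_features - n_true
    true_pos = len(selected & truth)   # relevant features that were selected
    false_pos = len(selected - truth)  # irrelevant features that were selected
    # Rates are undefined when there are no true or no null features.
    tpr = true_pos / n_true if n_true else float("nan")
    fpr = false_pos / n_null if n_null else float("nan")
    return tpr, fpr
```

For example, with 10 candidate features, true features {0, 1, 2}, and selected features {1, 2, 5}, the TPR is 2/3 and the FPR is 1/7. One plausible way to summarize a scenario is to average these pairs over its simulated data sets.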

Summary

Introduction

Advancements in next-generation sequencing (NGS) technologies have recently allowed for unprecedented examination of the community of microorganisms in a host or site of interest, referred to as a microbiome [29]. NGS technologies can rapidly detect thousands of microbes in each sample by determining the nucleotide sequences of short microbial DNA fragments. These fragments may either correspond to targets of a specific genetic marker, commonly the 16S ribosomal RNA gene for taxonomic identification of bacteria as in amplicon sequencing, or result from shearing all the DNA in a sample as in shotgun metagenome sequencing [40]. The corresponding nucleotide sequence is referred to as a “read,” the length of which is dependent on the specific NGS system [33]. Both amplicon-based and shotgun metagenomic approaches can enumerate the relative abundance of thousands of microbial features per sample. In the presence of such confounding, variable selection and estimation procedures may have unsatisfactory performance in identifying microbial features with an effect on the outcome.
