Abstract
Background: Analysis and prediction of complex traits using microbiome data combined with host genomic information is a topic of utmost interest. However, numerous questions remain to be answered: how useful can the microbiome be for complex trait prediction? Are estimates of microbiability reliable? Can the underlying biological links between the host's genome, microbiome, and phenome be recovered?
Methods: Here, we address these issues by (i) developing a novel simulation strategy that uses real microbiome and genotype data as inputs, and (ii) using variance-component approaches (Bayesian Reproducing Kernel Hilbert Space (RKHS) and Bayesian variable selection methods (Bayes C)) to quantify the proportion of phenotypic variance explained by the genome and the microbiome. The proposed simulation approach can mimic genetic links between the microbiome and genotype data through a permutation procedure that retains the distributional properties of the data.
Results: Using real genotype and rumen microbiota abundances from dairy cattle, simulation results suggest that microbiome data can significantly improve the accuracy of phenotype predictions, regardless of whether some microbiota abundances are under direct genetic control by the host or not. This improvement depends logically on the microbiome being stable over time. Overall, random-effects linear methods appear robust for variance components estimation, in spite of the typically highly leptokurtic distribution of microbiota abundances. The predictive performance of Bayes C was higher than that of RKHS but more sensitive to the number of causative effects. Accuracy with Bayes C depended, in part, on the number of microbial taxa that influence the phenotype.
Conclusions: While we conclude that, overall, genome-microbiome links can be characterized using variance component estimates, we are less optimistic about the possibility of identifying the causative host genetic effects that affect microbiota abundances, which would require much larger sample sizes than are typically available for genome-microbiome-phenome studies. The R code to replicate the analyses is available at https://github.com/miguelperezenciso/simubiome.
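The abstract describes a permutation procedure that links microbiota abundances to host genotypes while retaining the distribution of the data. The Python sketch below illustrates one way such a link could be induced, by rank matching; it is an assumption for illustration, not necessarily the authors' procedure. The function name `link_by_permutation`, the toy genotypes, the toy abundances, and the parameter `h2` are all hypothetical.

```python
import numpy as np


def link_by_permutation(abundance, genetic_value, h2, rng):
    """Reorder `abundance` so that its ranks follow a noisy copy of `genetic_value`.

    The result is a permutation of the input, so the marginal distribution
    of the abundances (including its heavy tails) is unchanged.
    `h2` sets how strongly the ordering follows the genetic value (assumption).
    """
    g = (genetic_value - genetic_value.mean()) / genetic_value.std()
    noise = rng.standard_normal(g.size)
    liability = np.sqrt(h2) * g + np.sqrt(1.0 - h2) * noise
    # Rank of each individual's liability, from 0 (lowest) to n - 1 (highest).
    ranks = np.argsort(np.argsort(liability))
    # The individual with the k-th lowest liability gets the k-th lowest abundance.
    return np.sort(abundance)[ranks]


rng = np.random.default_rng(1)
n, p = 500, 50
X = rng.binomial(2, 0.3, size=(n, p)).astype(float)  # toy SNP genotypes coded 0/1/2
beta = rng.standard_normal(p)                        # toy causative SNP effects
g = X @ beta                                         # toy genetic value for one taxon
otu = rng.lognormal(mean=0.0, sigma=2.0, size=n)     # toy heavy-tailed abundances

linked = link_by_permutation(otu, g, h2=0.5, rng=rng)
```

Because `linked` contains exactly the same values as `otu`, any downstream summary of the abundance distribution is unaffected; only the assignment of values to individuals changes.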
Highlights
Analysis and prediction of complex traits using microbiome data combined with host genomic information is a topic of utmost interest
We compared the predictive performance of the Bayesian RKHS (GBLUP-like approach) and Bayes C [20] approaches when both genome and microbiome data were included in the model (Rgb and Cgb), including an interaction term between g and b (Rgbx), or only genome data (Rg, Cg), or only microbiome data (Rb, Cb)
We found that variance component estimates from RKHS were both slightly less biased and much less sensitive to the number of causative effects (OTU or SNPs) than estimates from Bayes C
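The RKHS models listed above (Rg, Rb, Rgb, Rgbx) differ only in which similarity kernels enter the model. The Python sketch below shows how such kernels might be assembled from toy data. The cross-product kernel, the log transform of abundances, and the use of a Hadamard product for the g-by-b interaction are assumptions made for illustration; the source does not state how the kernels were built. The helper `linear_kernel` is hypothetical, and model fitting is not shown.

```python
import numpy as np


def linear_kernel(M):
    """Cross-product kernel of column-centered data, scaled to mean diagonal 1."""
    Z = M - M.mean(axis=0)
    K = Z @ Z.T
    return K / np.mean(np.diag(K))


rng = np.random.default_rng(7)
n = 100
X = rng.binomial(2, 0.3, size=(n, 200)).astype(float)  # toy genotypes coded 0/1/2
A = np.log1p(rng.lognormal(0.0, 2.0, size=(n, 60)))    # toy log-transformed abundances

G = linear_kernel(X)  # genomic similarity between individuals
B = linear_kernel(A)  # microbiome similarity between individuals
GxB = G * B           # element-wise product as interaction kernel (assumption)

# Kernels entering each RKHS model; the labels are the ones used in the source.
kernel_sets = {
    "Rg": [G],
    "Rb": [B],
    "Rgb": [G, B],
    "Rgbx": [G, B, GxB],
}
```

Each kernel corresponds to one variance component, so comparing the models amounts to comparing how phenotypic variance is partitioned when a kernel is added or removed.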
Summary
Analysis and prediction of complex traits using microbiome data combined with host genomic information is a topic of utmost interest.
(Pérez-Enciso et al., Genet Sel Evol (2021) 53:65)
[...] microbiability being larger than zero is that the whole microbiome can be used to predict complex phenotypes, regardless of whether it is a disease or a production trait. This is an important issue since the use of microbiome data has the potential to alter how medical diagnoses in humans or management and breeding decisions in agricultural species are performed. Since the groundbreaking study of Meuwissen et al. [20], prediction of complex traits using genomic information has been embraced in both plant [21] and animal breeding [22], as well as in human genetics [23]. Combining the host's genome and microbiome information is a natural step to improve the prediction of complex traits, a topic that is currently receiving much attention [16, 24].