Abstract
The advent of high-throughput metagenomic sequencing has prompted the development of efficient taxonomic profiling methods that measure the presence, abundance and phylogeny of organisms in a wide range of environmental samples. Multivariate sequence-derived abundance data further has the potential to enable inference of ecological associations between microbial populations, but several technical issues need to be accounted for, such as the compositional nature of the data, its extreme sparsity and overdispersion, and the frequent need to operate in under-determined regimes. The ecological network reconstruction problem is frequently cast into the paradigm of Gaussian Graphical Models (GGMs), for which efficient structure inference algorithms are available, such as the graphical lasso and neighborhood selection. Unfortunately, GGMs and their variants cannot properly account for the extremely sparse patterns occurring in real-world metagenomic taxonomic profiles. In particular, structural zeros (as opposed to sampling zeros), which correspond to true absences of biological signals, are not properly handled by most statistical methods. We present here a zero-inflated log-normal graphical model (available at https://github.com/vincentprost/Zi-LN) specifically aimed at handling such "biological" zeros, and demonstrate significant performance gains over state-of-the-art statistical methods for the inference of microbial association networks, with the most notable gains obtained when analyzing taxonomic profiles displaying sparsity levels on par with real-world metagenomic datasets.
Highlights
Metagenomics has increased our awareness that microbes most often live in communities structured by both environmental factors and ecological associations between community members
Computational methods to predict microbial associations can be of practical interest, in particular given the large amounts of multivariate microbial abundance data generated by metagenomics
The advent of high-throughput sequencing and metagenomics resulted in the production of large amounts of multivariate microbial abundance data that could in theory be leveraged to reconstruct microbial association networks [8]
Summary
Metagenomics has increased our awareness that microbes most often live in communities structured by both environmental factors and ecological associations between community members. Understanding the structure and dynamics of microbial communities requires detecting such associations, and is of both fundamental [1] and practical importance, as it may enable the design and engineering of consortia of interacting organisms to fulfill various needs (e.g. improving the efficiency of wastewater treatment plants [2, 3] and designing more robust fecal transplants [4]). Interactions within these microbial systems can have a positive, negative or null impact on the involved organisms, leading to a typology of pairwise interactions based on combinations of these outcomes, e.g. Lidicker [5] distinguishes mutualism (+/+), competition (-/-), predation or parasitism ((+/-) or (-/+)), amensalism ((0/-) or (-/0)) and commensalism ((+/0) or (0/+)). Only limited success has been achieved so far in robustly inferring the structure of real-world microbial association networks, and different methods frequently yield quite different results [9].
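The sign-pair typology described above can be written as a small lookup table. The following Python sketch is illustrative only: the function name `classify_interaction` and the table name are hypothetical, and the label used for the (0/0) case, which the text does not name, is an assumption.

```python
# Illustrative sketch of the pairwise interaction typology (sign pairs).
# Names below are hypothetical and not taken from the paper or its code.

VALID_EFFECTS = {"+", "-", "0"}

# Keys are sign pairs in sorted character order ("+" < "-" < "0"),
# so that (a, b) and (b, a) map to the same interaction type.
INTERACTION_TYPES = {
    ("+", "+"): "mutualism",
    ("-", "-"): "competition",
    ("+", "-"): "predation or parasitism",
    ("-", "0"): "amensalism",
    ("+", "0"): "commensalism",
    # Assumption: the text does not name the (0/0) case.
    ("0", "0"): "neutralism",
}


def classify_interaction(effect_on_a: str, effect_on_b: str) -> str:
    """Return the interaction type for the effects on the two organisms."""
    for effect in (effect_on_a, effect_on_b):
        if effect not in VALID_EFFECTS:
            raise ValueError(f"effect must be '+', '-' or '0', got {effect!r}")
    key = tuple(sorted((effect_on_a, effect_on_b)))
    return INTERACTION_TYPES[key]
```

For example, `classify_interaction("0", "-")` and `classify_interaction("-", "0")` both return `"amensalism"`, reflecting that the typology does not depend on which organism is listed first.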