Abstract

Longitudinal metagenomics data, including both 16S rRNA and whole-metagenome shotgun sequencing data, enhanced our abilities to understand the dynamic associations between the human microbiome and various diseases. However, analytic tools have not been fully developed to simultaneously address the main challenges of longitudinal metagenomics data, i.e. high-dimensionality, dependence among samples and zero-inflation of observed counts. We propose a fast zero-inflated negative binomial mixed modeling (FZINBMM) approach to analyze high-dimensional longitudinal metagenomic count data. The FZINBMM approach is based on zero-inflated negative binomial mixed models (ZINBMMs) for modeling longitudinal metagenomic count data and a fast EM-IWLS algorithm for fitting ZINBMMs. FZINBMM takes advantage of a commonly used procedure for fitting linear mixed models, which allows us to include various types of fixed and random effects and within-subject correlation structures and quickly analyze many taxa. We found that FZINBMM remarkably outperformed in computational efficiency and was statistically comparable with two R packages, GLMMadaptive and glmmTMB, that use numerical integration to fit ZINBMMs. Extensive simulations and real data applications showed that FZINBMM outperformed other previous methods, including linear mixed models, negative binomial mixed models and zero-inflated Gaussian mixed models. FZINBMM has been implemented in the R package NBZIMM, available in the public GitHub repository http://github.com//nyiuab//NBZIMM. Supplementary data are available at Bioinformatics online.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call