Abstract

Background: Microbiome/metagenomic data have specific characteristics, including varying total sequence reads, over-dispersion, and zero-inflation, which require tailored analytic tools. Many microbiome/metagenomic studies follow a longitudinal design to collect samples, which further complicates the analysis methods needed. A flexible and efficient R package is needed for analyzing processed multilevel or longitudinal microbiome/metagenomic data.

Results: NBZIMM is a freely available R package that provides functions for setting up and fitting negative binomial mixed models, zero-inflated negative binomial mixed models, and zero-inflated Gaussian mixed models. It also provides functions to summarize the results from fitted models, both numerically and graphically. The main functions are built on top of the commonly used R packages nlme and MASS, allowing us to incorporate the well-developed analytic procedures into the framework for analyzing over-dispersed and zero-inflated count or proportion data with multilevel structures (e.g., longitudinal studies). The statistical methods and their implementations in NBZIMM particularly address the data characteristics and the complex designs in microbiome/metagenomic studies. The package is freely available from the public GitHub repository https://github.com/nyiuab/NBZIMM.

Conclusion: The NBZIMM package provides useful tools for complex microbiome/metagenomics data analysis.

Highlights

  • Microbiome/metagenomic data have specific characteristics, including varying total sequence reads, over-dispersion, and zero-inflation, which require tailored analytic tools

  • Other R packages can set up both negative binomial mixed models (NBMMs) and zero-inflated negative binomial mixed models (ZINBMMs), but may not be ideal for analyzing microbiome/metagenomics data because of limited computational efficiency

  • We evaluated the three methods implemented in NBZIMM (negative binomial and zero-inflated mixed models), namely NBMMs, ZINBMMs, and zero-inflated Gaussian mixed models (ZIGMMs), and compared them with various existing methods in different R packages


Summary

Results

In recently published microbiome studies [19, 20], both sequence data and processed abundance data tables were made available. The example analyzes, with NBMMs and ZINBMMs, all taxa whose proportion of non-zero values exceeds 0.2, a threshold set through the term min.p.

Visualize the results

To visualize the results through NBZIMM, several options are available in the package.

The term Method needs to be set as ‘zig’ for ZIGMMs. The example analyzes, with ZIGMMs, all taxa whose proportion of non-zero values exceeds 0.2, again set through the term min.p. The visualization options are the same as for NBMMs and ZINBMMs.

Demonstrations of ZIGMMs in analyzing longitudinal microbiome/metagenomics proportion data

ZIGMMs are applicable to longitudinal microbiome/metagenomics proportion data after an arcsine square root transformation, through the functions lme.zig and mms. For the function lme.zig, the terms fixed, random, data, zi_fixed, zi_random, and correlation are described in the “Demonstrations of ZIGMMs in analyzing longitudinal microbiome/metagenomics count data” section. Another option is to generate a heat map, as in the example shown in Fig. 7, using the function heat.p.
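The two data-preparation ideas mentioned above, keeping only taxa whose proportion of non-zero values exceeds a threshold and applying the arcsine square root transformation to proportions, can be sketched in base R. This is a minimal illustration on made-up toy counts, not the package's own code: the object names (counts, min_p, asin_sqrt) are hypothetical, and the actual model fitting would go through the NBZIMM functions named in the text (mms, lme.zig), which are not called here.

```r
# Toy count table (hypothetical data): 6 samples (rows) x 4 taxa (columns).
counts <- matrix(
  c(10, 0, 0, 5,
    12, 0, 3, 0,
     8, 0, 0, 7,
    15, 1, 0, 0,
     9, 0, 0, 6,
    11, 0, 2, 4),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("s", 1:6), paste0("taxon", 1:4))
)

# Proportion of non-zero values per taxon (column).
nonzero_prop <- colMeans(counts > 0)

# Keep taxa whose non-zero proportion exceeds the threshold,
# mirroring the role the text assigns to min.p (assumed semantics: strict ">").
min_p <- 0.2
keep <- nonzero_prop > min_p

# Convert counts to within-sample proportions (each row sums to 1).
props <- prop.table(counts, margin = 1)

# Arcsine square root transformation of the retained taxa.
asin_sqrt <- asin(sqrt(props[, keep, drop = FALSE]))

print(round(nonzero_prop, 3))
print(colnames(asin_sqrt))
```

In this toy table, taxon2 is non-zero in only one of six samples, so it falls below the 0.2 threshold and is dropped before transformation. Whether the package computes proportions before or after filtering is not stated in the text; the order used here is an assumption.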
