The literature on univariate regression models for count data is rich, but the same is not true for multivariate count data. We therefore present the Multivariate Generalized Linear Mixed Models framework, which handles a multivariate set of responses and measures the correlation between them through random effects that follow a multivariate normal distribution. The model is based on a GLMM with a random intercept, and estimation proceeds as in a standard GLMM, with the random effects integrated out via the Laplace approximation. We implemented the model efficiently using the TMB package available in R, with Poisson, negative binomial (NB), and COM-Poisson distributions. To assess the properties of the estimators, we conducted a simulation study with four sample sizes and three correlation values for each distribution. The estimators were unbiased and consistent for the Poisson and NB distributions; for the COM-Poisson distribution they were consistent but biased, especially those for the dispersion, variance, and correlation parameters. We applied the models to two datasets. The first comes from 30 sites in Australia, at each of which the number of times each of 41 ant species was recorded was counted; as a result, 820 variance-covariance parameters and 41 dispersion parameters are estimated simultaneously, in addition to the regression parameters. The second is from the Australia Health Survey, with 5 response variables and 5190 respondents. Both datasets can be considered overdispersed according to the generalized dispersion index. The COM-Poisson model outperformed the other two according to three goodness-of-fit measures: AIC, BIC, and the maximized log-likelihood. It also yielded parameter estimates with smaller standard errors and a greater number of significant correlation coefficients.
Therefore, the proposed model can handle multivariate count data with under-, equi-, or overdispersed responses and can measure the correlation between them while taking the effects of the covariates into account.
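
As a rough illustration of the mechanism described above, the following Python sketch simulates counts from a Poisson model whose log-means share correlated normal random intercepts. It is not the authors' TMB implementation. The function name `simulate_mglmm_poisson`, the parameter values, and the equicorrelated covariance structure are assumptions made only for this example, and the per-response variance-to-mean ratio is used as a simple stand-in for the generalized dispersion index mentioned in the text.

```python
import numpy as np


def simulate_mglmm_poisson(n, beta0, sigma, rho, seed=0):
    """Simulate n observations of correlated Poisson counts.

    Hypothetical helper: each response j has log-mean beta0[j] + b[j],
    where the random intercepts b follow a multivariate normal
    distribution with standard deviations sigma and a common
    correlation rho (an assumed, simplified structure).
    """
    rng = np.random.default_rng(seed)
    beta0 = np.asarray(beta0, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    r = beta0.size

    # Build the covariance matrix of the random intercepts.
    corr = np.full((r, r), rho)
    np.fill_diagonal(corr, 1.0)
    cov = corr * np.outer(sigma, sigma)

    # One vector of random intercepts per observation.
    b = rng.multivariate_normal(np.zeros(r), cov, size=n)

    # Conditional on b, the responses are independent Poisson counts.
    mu = np.exp(beta0 + b)
    return rng.poisson(mu)


y = simulate_mglmm_poisson(n=5000, beta0=[1.0, 0.5], sigma=[0.5, 0.5], rho=0.8)

# Variance-to-mean ratio per response; values above 1 suggest overdispersion.
dispersion = y.var(axis=0, ddof=1) / y.mean(axis=0)

# Marginal correlation between the two count responses.
sample_corr = np.corrcoef(y, rowvar=False)[0, 1]

print("dispersion index per response:", dispersion)
print("sample correlation:", sample_corr)
```

Even though each response is Poisson conditional on its random intercept, the marginal counts in this sketch are overdispersed and positively correlated, which is the sense in which the random effects carry the dependence between responses.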