Abstract
BackgroundTwo characteristics of commonly used outcomes in medical research are zero inflation and non-negative integers; examples include the number of hospital admissions or emergency department visits, where the majority of patients will have zero counts. Zero-inflated regression models were devised to analyze this type of data. However, the performance of zero-inflated regression models or the properties of data best suited for these analyses have not been thoroughly investigated.MethodsWe conducted a simulation study to evaluate the performance of two generalized linear models, negative binomial and zero-inflated negative binomial, for analyzing zero-inflated count data. Simulation scenarios assumed a randomized controlled trial design and varied the true underlying distribution, sample size, and rate of zero inflation. We compared the models in terms of bias, mean squared error, and coverage. Additionally, we used logistic regression to determine which data properties are most important for predicting the best-fitting model.ResultsWe first found that, regardless of the rate of zero inflation, there was little difference between the conventional negative binomial and its zero-inflated counterpart in terms of bias of the marginal treatment group coefficient. Second, even when the outcome was simulated from a zero-inflated distribution, a negative binomial model was favored above its ZI counterpart in terms of the Akaike Information Criterion. Third, the mean and skewness of the non-zero part of the data were stronger predictors of model preference than the percentage of zero counts. These results were not affected by the sample size, which ranged from 60 to 800.ConclusionsWe recommend that the rate of zero inflation and overdispersion in the outcome should not be the sole and main justification for choosing zero-inflated regression models. Investigators should also consider other data characteristics when choosing a model for count data. In addition, if the performance of the NB and ZINB regression models is reasonably comparable even with ZI outcomes, we advocate the use of the NB regression model due to its clear and straightforward interpretation of the results.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.