Abstract
Statistical models provide a basis for the analysis of infectious disease counts. These data sets exhibit distinctive characteristics such as low counts, delayed reporting, and underreporting, among others. Because of their simplicity, linear models are commonly used for these counts in most research. Further, a fixed dispersion is very commonly assumed when modeling infectious disease counts. Predictions of infectious disease counts have largely been based on the Poisson model framework. Extensions of the Poisson model, such as the Negative Binomial (NB) and Poisson Inverse Gaussian (PIG) distributions, have gained popularity in the recent past for modeling count responses that show overdispersion relative to the Poisson distribution. In this study we propose non-linear models for these data sets, modeling the mean and dispersion parameters as additive terms. We fit NB and PIG generalized linear models (GLMs) with a fixed and with a varying dispersion parameter and compare them with NB and PIG generalized additive models (GAMs) in which both the mean and the dispersion are modeled as additive terms. The models are fitted to an overdispersed infectious disease count data set, the Salmonella Hadar data. Residual plots are constructed to explore the quality of the fits, and a goodness-of-fit analysis is carried out to assess the best-fitting model. The results show that the PIG models perform better on both the linear and non-linear model platforms. Further, modeling both the mean and the dispersion proved better than assuming a constant dispersion.
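As a rough sketch of what "modeling the mean and dispersion parameters as additive terms" can look like, the block below writes out one possible model structure. The log links, the symbols (mu_i, sigma_i, f_j, g_k, beta, gamma) and the covariate names x and z are assumptions made for illustration; the abstract does not specify them.

```latex
% Assumed notation: y_i is the observed count, D is either NB or PIG,
% f_j and g_k are smooth functions of covariates x_{ij} and z_{ik}.
\begin{align}
  y_i \mid \mu_i, \sigma_i &\sim \mathcal{D}(\mu_i, \sigma_i),
      \qquad \mathcal{D} \in \{\mathrm{NB}, \mathrm{PIG}\}, \\
  \log \mu_i    &= \beta_0  + \sum_{j=1}^{J} f_j(x_{ij}), \\
  \log \sigma_i &= \gamma_0 + \sum_{k=1}^{K} g_k(z_{ik}).
\end{align}
% Linear (GLM) special case:  f_j(x) = \beta_j x  and  g_k(z) = \gamma_k z.
% Fixed-dispersion special case:  \log \sigma_i = \gamma_0.
```

Under this reading, the GLM fits with fixed or varying dispersion are special cases of the additive model, which is what makes the comparison described in the abstract a nested one.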
Highlights
Count data are encountered on a daily basis in everyday dealings
Generalized additive models based on the Poisson Inverse Gaussian (PIG) and Negative Binomial (NB) distributions were fitted to infectious disease counts
Among the linear model fits, the PIG GLMs with a varying dispersion parameter fitted the count data better than the other models
Summary
Count data are encountered on a daily basis in everyday dealings. Such data exhibit distinctive characteristics such as over-dispersion, under-dispersion, incompleteness, and excess zeros, among others. The negative binomial (NB) and Generalized Poisson (GP) distributions [6] enable independent modelling of both the mean and the variance by incorporating an additional parameter. They account for additional variation within the data by adding a randomly distributed error term that is based on the Gamma distribution. A special form of the Sichel (SI) distribution, the Poisson inverse Gaussian (PIG), in which the shape parameter of the SI model is set to -0.5, provides a better model for the analysis of count data with slightly longer tails and excess kurtosis [39]. A regression model using an alternative parametrization of the PIG distribution, in which the shape parameter is orthogonal to the mean, should be considered [21]. This parametrization leads to model estimates that are robust to misspecification of the dispersion model. The study proposes a non-linear parameterization of the PIG model in the location and dispersion parameters.
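The role of the random error term can be illustrated with a small simulation. The sketch below is not the study's code: it assumes a mixed-Poisson construction in which the Poisson rate is mu times a random multiplier with mean 1 and variance sigma, drawn from a Gamma distribution for the NB case and from an inverse Gaussian distribution for the PIG case. The values of mu, sigma and n, the seed, and all function names are hypothetical choices for illustration, and this parametrization may differ from the one used in the paper.

```python
import math
import random


def poisson_draw(rng, rate):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def inverse_gaussian_draw(rng, mean, shape):
    """Draw one inverse Gaussian variate (Michael-Schucany-Haas method)."""
    y = rng.gauss(0.0, 1.0) ** 2
    x = mean + mean**2 * y / (2 * shape) - (mean / (2 * shape)) * math.sqrt(
        4 * mean * shape * y + mean**2 * y**2
    )
    return x if rng.random() <= mean / (mean + x) else mean**2 / x


def summarize(sample):
    """Return the sample mean, variance, and skewness."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((v - mean) ** 2 for v in sample) / n
    m3 = sum((v - mean) ** 3 for v in sample) / n
    return mean, m2, m3 / m2**1.5


rng = random.Random(2024)
mu, sigma, n = 5.0, 0.5, 100_000  # illustrative values, not from the study

# NB: Gamma multiplier with shape 1/sigma and scale sigma (mean 1, variance sigma).
nb = [poisson_draw(rng, mu * rng.gammavariate(1 / sigma, sigma)) for _ in range(n)]

# PIG: inverse Gaussian multiplier with mean 1 and shape 1/sigma (variance sigma).
pig = [
    poisson_draw(rng, mu * inverse_gaussian_draw(rng, 1.0, 1 / sigma))
    for _ in range(n)
]

nb_mean, nb_var, nb_skew = summarize(nb)
pig_mean, pig_var, pig_skew = summarize(pig)

# A plain Poisson model would have variance equal to the mean (5.0 here);
# under this construction both mixtures have variance mu + sigma * mu**2.
print(f"theoretical variance: {mu + sigma * mu**2:.2f}")
print(f"NB : mean={nb_mean:.2f} var={nb_var:.2f} skew={nb_skew:.2f}")
print(f"PIG: mean={pig_mean:.2f} var={pig_var:.2f} skew={pig_skew:.2f}")
```

With the same mean and variance, the PIG sample should show the larger skewness, which is consistent with the longer tails mentioned above. Knuth's method is used only to keep the example free of third-party dependencies; it is adequate for the small rates that arise here.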
More From: International Journal of Data Science and Analysis