Abstract
Least squares regression suffers from several methodological and statistical limitations when the dependent variable is a discrete, non-negative integer count. Unlike the classical regression model, count data models are non-linear, and the response variable is discrete and takes non-negative values only. The Poisson regression model is a good starting point for modelling count data because it suits the nature of such data. However, its assumption of equi-dispersion makes it inappropriate for modelling over-dispersed data. The Negative Binomial regression model, a modification of the Poisson regression model, has been widely used and is regarded as the default regression model for over-dispersed count data. Although widely used, it might not be the best model for over-dispersion, and other models have been found to perform better. In this study, over-dispersion is defined relative to the Poisson model. The study models over-dispersed count data using a discrete Weibull (DW) regression model and an artificial neural network model with a median neuron in the hidden layer. When the two models were fitted to simulated and real data, the artificial neural network model outperformed the discrete Weibull regression model. On a data set from a German health survey, the RMSE was 69.0668 for the DW regression model and 35.5652 for the artificial neural network.
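To make the two quantities named in the abstract concrete, the following minimal sketch evaluates a discrete Weibull probability mass function and the RMSE used for the comparison. The abstract does not state which parametrization of the discrete Weibull is used, so the type I form of Nakagawa and Osaki, P(Y = y) = q^(y^beta) - q^((y+1)^beta), is assumed here; the function names `dw_pmf` and `rmse` are illustrative and do not come from the paper.

```python
import numpy as np

def dw_pmf(y, q, beta):
    """Type I discrete Weibull pmf (assumed parametrization, not confirmed by the source).

    P(Y = y) = q**(y**beta) - q**((y + 1)**beta), for y = 0, 1, 2, ...
    with 0 < q < 1 and beta > 0.
    """
    y = np.asarray(y, dtype=float)
    return q ** (y ** beta) - q ** ((y + 1) ** beta)

def rmse(observed, predicted):
    """Root mean squared error between observed and predicted counts."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))

# With beta = 1 the pmf reduces to a geometric distribution: P(Y = 0) = 1 - q.
p_zero = float(dw_pmf(0, q=0.5, beta=1.0))

# The probabilities telescope, so summing over a long range should be close to 1.
total_mass = float(dw_pmf(np.arange(200), q=0.8, beta=1.3).sum())
```

In a regression setting, q (or a function of it) would typically be linked to covariates; that step is omitted here because the abstract gives no details of the link function.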
Highlights
Count data is defined as the number of times an event occurs within a fixed period of time
This study examines discrete Weibull (DW) regression and artificial neural network (ANN) models
This study evaluates the performance of DW regression in comparison with the ANN model
Summary
Count data is defined as the number of times an event occurs within a fixed period of time. Some examples of such data are the number of road accident deaths, the number of patents awarded to a firm, the number of dengue fever cases, which is restricted to a single digit or integer with a low number of events, and the number of times a doctor visits a patient [2]. Least squares regression suffers from several methodological and statistical limitations when the dependent variable is a discrete, non-negative integer count [1]. The Poisson regression model is a good starting point for modelling count data because it suits the nature of such data. The Poisson regression model still has one potential problem: the property of equi-dispersion, that is, the assumption that the variance equals the mean. In count data modelling, once a Poisson regression model has been developed, the analysis proceeds to correct for dispersion if it exists.
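The equi-dispersion property described above can be checked informally by comparing the sample variance with the sample mean. The sketch below is a minimal illustration under stated assumptions: the helper `dispersion_index` and the simulated data are hypothetical and are not taken from the study. A ratio near 1 is consistent with the Poisson assumption, while a ratio well above 1 suggests over-dispersion relative to the Poisson model.

```python
import numpy as np

def dispersion_index(counts):
    """Return the sample variance-to-mean ratio of a vector of counts."""
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean()
    if mean == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return counts.var(ddof=1) / mean

rng = np.random.default_rng(0)

# Poisson counts: theoretical variance equals the mean (both 4).
poisson_counts = rng.poisson(lam=4.0, size=5000)

# Negative binomial counts with n = 2, p = 1/3:
# theoretical mean n(1-p)/p = 4 and variance n(1-p)/p**2 = 12, so the ratio is 3.
overdispersed_counts = rng.negative_binomial(n=2, p=1 / 3, size=5000)

poisson_ratio = dispersion_index(poisson_counts)
overdispersed_ratio = dispersion_index(overdispersed_counts)
```

This ratio is only a descriptive check on the raw counts; in a regression model, dispersion would normally be assessed conditionally on the covariates.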
More From: International Journal of Data Science and Analysis