Abstract

Least squares regression has several methodological and statistical limitations when the dependent variable is a discrete, non-negative integer count. Unlike the classical regression model, count data models are non-linear and must respect the discreteness and non-negativity of the response variable. The Poisson regression model is a good starting point for modelling count data because it matches these properties. However, its assumption of equi-dispersion (variance equal to the mean) makes it inappropriate for over-dispersed data. The Negative Binomial regression model, a modification of the Poisson regression model, has been widely used and is often treated as the default regression model for over-dispersed count data. Despite its wide use, it might not be the best model for over-dispersion, and other models have been found to perform better. Over-dispersion in this study was defined relative to the Poisson model. This study models over-dispersed count data using a discrete Weibull (DW) regression model and an artificial neural network (ANN) model with a median neuron in the hidden layer. When the two models were fitted to simulated data and real data, the ANN model outperformed the DW regression model. On a data set from a German health survey, the RMSE was 69.0668 for the DW regression model and 35.5652 for the ANN model.
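The comparison above rests on RMSE (root mean squared error). As a minimal sketch of how that metric is computed, assuming the usual definition (the square root of the mean squared difference between observed and predicted counts), the following may help; the function name `rmse` and the sample values are illustrative only and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of squared differences, then the square root.
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical observed counts and model predictions (illustration only).
observed_counts = [0, 2, 5, 1, 9]
model_predictions = [1.0, 2.5, 3.0, 1.0, 6.5]
print(rmse(observed_counts, model_predictions))
```

A lower RMSE means the predictions sit closer to the observed counts on average, which is the sense in which one model is said to outperform the other.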

Highlights

  • Count data is defined as the number of times an event occurs within a fixed period of time

  • This study looked at discrete Weibull (DW) regression and artificial neural network (ANN) models

  • This study evaluates the performance of DW regression in comparison with the ANN model


Summary

Introduction

Count data is defined as the number of times an event occurs within a fixed period of time. Some examples of such data are the number of road accident deaths, the number of patents awarded to a firm, the number of dengue fever cases, which is restricted to a single digit or integer with a low number of events, and the number of times a doctor visits a patient [2]. Least squares regression has several methodological and statistical limitations when the dependent variable is a discrete, non-negative integer count [1]. The Poisson regression model is a good starting point for modelling count data because it matches the properties of such data. The Poisson regression model still has one potential problem: the property of equi-dispersion, that is, the assumption that the variance equals the mean. In count data modelling, once a Poisson regression model has been developed, the analysis proceeds to correct for dispersion if it exists.
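The equi-dispersion assumption can be checked informally by comparing the sample variance of the counts with their sample mean. The sketch below assumes a simple variance-to-mean ratio as the diagnostic; the function name `dispersion_index` and both data sets are hypothetical and are not drawn from the study, which may use a different test.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean for a sequence of counts.

    A value near 1 is consistent with the Poisson equi-dispersion
    assumption; a value well above 1 suggests over-dispersion.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m


# Hypothetical counts, invented for illustration only.
roughly_poisson = [0, 1, 1, 2, 2, 3, 3, 5, 1, 2]
over_dispersed = [0, 0, 1, 0, 12, 0, 2, 9]

print(dispersion_index(roughly_poisson))
print(dispersion_index(over_dispersed))
```

The ratio is only a descriptive check on the raw counts. In a regression setting, dispersion is judged after conditioning on the covariates, so this should be read as a first look and not as a formal test.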

