Noncanonical links in generalized linear models – when is the effort justified?

Claudia Czado, Axel Munk

doi:10.1016/s0378-3758(99)00195-0

Abstract

Generalized linear models (GLMs) allow for a wide range of statistical models for regression data. In particular, the logistic model is usually applied to binomial observations. Canonical links for GLMs, such as the logit link in the binomial case, are often used because in this case minimal sufficient statistics for the regression parameter exist which allow for simple interpretation of the results. However, in some applications the overall fit, as measured by the p-values of goodness-of-fit statistics (such as the residual deviance), can be improved significantly by the use of a noncanonical link. In this case, the interpretation of the influence of the covariables is more complicated than in GLMs with canonical link functions. It will be illustrated through simulation that the p-value associated with the common goodness-of-link tests is not appropriate for quantifying the changes to mean response estimates and other quantities of interest when switching to a noncanonical link. In particular, the rate of misspecifications becomes considerably large when the inverse information value associated with the underlying parametric link model increases. This shows that the classical tests are often too sensitive, in particular when the number of observations is large. The consideration of a generalized p-value function is proposed instead, which allows the exact quantification of a suitable distance to the canonical model at a controlled error rate. Corresponding tests for validating or discriminating the canonical model can easily be performed by means of this function. Finally, it is indicated how this method can be applied to the problem of overdispersion.
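To make the comparison between a canonical and a noncanonical link concrete, the following is a minimal sketch, not the authors' procedure. It fits a binomial GLM with one covariate by Fisher scoring under the logit link and under the complementary log-log link (chosen here only as one example of a noncanonical link) and reports the residual deviance of each fit. The data are synthetic and noise-free, generated from a complementary log-log model with assumed intercept 0 and slope 1. The function and variable names (`fit_binomial_glm`, `LINKS`, `xlogy`) are hypothetical, and the sketch does not implement the generalized p-value function proposed in the paper.

```python
import math

# For each link: inverse link h(eta), derivative dh/deta, and link g(mu).
LINKS = {
    "logit": (
        lambda e: 1.0 / (1.0 + math.exp(-e)),
        lambda e: math.exp(-e) / (1.0 + math.exp(-e)) ** 2,
        lambda m: math.log(m / (1.0 - m)),
    ),
    "cloglog": (
        lambda e: 1.0 - math.exp(-math.exp(e)),
        lambda e: math.exp(e - math.exp(e)),
        lambda m: math.log(-math.log(1.0 - m)),
    ),
}


def xlogy(a, b):
    """Return a * log(b), using the convention 0 * log(0) = 0."""
    return 0.0 if a == 0 else a * math.log(b)


def fit_binomial_glm(x, n, y, link, n_iter=50):
    """Fit eta = b0 + b1 * x by Fisher scoring; return (b0, b1, deviance)."""
    inv, dmu_deta, g = LINKS[link]
    # Start from smoothed observed proportions to avoid g(0) and g(1).
    eta = [g((yi + 0.5) / (ni + 1.0)) for yi, ni in zip(y, n)]
    b0 = b1 = 0.0
    for _ in range(n_iter):
        mu = [min(max(inv(e), 1e-10), 1.0 - 1e-10) for e in eta]
        d = [max(dmu_deta(e), 1e-12) for e in eta]
        # Working weights and working response of the IRLS step.
        w = [ni * di ** 2 / (m * (1.0 - m)) for ni, di, m in zip(n, d, mu)]
        z = [e + (yi / ni - m) / di
             for e, yi, ni, m, di in zip(eta, y, n, mu, d)]
        # Weighted least squares of z on (1, x) via the 2x2 normal equations.
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = sw * swxx - swx ** 2
        b0 = (swxx * swz - swx * swxz) / det
        b1 = (sw * swxz - swx * swz) / det
        eta = [b0 + b1 * xi for xi in x]
    mu = [min(max(inv(e), 1e-10), 1.0 - 1e-10) for e in eta]
    # Residual deviance of the binomial model.
    dev = 2.0 * sum(
        xlogy(yi, yi / (ni * m)) + xlogy(ni - yi, (ni - yi) / (ni * (1.0 - m)))
        for yi, ni, m in zip(y, n, mu)
    )
    return b0, b1, dev


# Synthetic, noise-free data from an assumed cloglog model (illustration only).
x = [-2.0 + 0.5 * i for i in range(9)]
n = [200] * 9
y = [round(ni * LINKS["cloglog"][0](xi)) for xi, ni in zip(x, n)]

b0_l, b1_l, dev_logit = fit_binomial_glm(x, n, y, "logit")
b0_c, b1_c, dev_cloglog = fit_binomial_glm(x, n, y, "cloglog")
print("logit   deviance:", dev_logit)
print("cloglog deviance:", dev_cloglog)
```

Because the data are generated from the complementary log-log model, that link yields the smaller residual deviance here. Whether such a reduction translates into a practically relevant change in the estimated mean response is the question the abstract raises, and a deviance comparison alone does not answer it.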
