Abstract

SUMMARY The use of nonparametric regression is explored to check the fit of a parametric regression model. The principal aim is to check the validity of the regression curve rather than necessarily to detect outliers. A pseudo likelihood ratio test is developed to provide a global assessment of fit and simulation bands are used to indicate the nature of departures from the model. The types of data considered include discrete response variables, where standard diagnostic techniques are often not appropriate, and first-order autoregressive series. Several numerical examples are given.

Nonparametric regression can be used in an informal graphical way to assess the relationship between a response and an explanatory variable. In this paper we aim to develop more formal methods of assessing the assumptions of a parametric model, in particular when regression diagnostics of the type developed for normal linear models are not readily available. The principal aim is to check the validity of the systematic part of the model by comparing a nonparametric estimate of the regression curve with a parametric one. Such a comparison may also identify outliers, although the distinction between outliers and model inadequacy is not always easy.

Two techniques are used to assess the fit of a parametric model. In § 2, confidence bands are constructed around the fitted regression curve by simulation. A comparison of these with the nonparametric curve gives an indication of the nature of any departures from the model. In § 3, a pseudo likelihood ratio test is developed. This provides a quantitative global assessment of fit. In applying these ideas, special emphasis is given to discrete data, and notably logistic regression, because of the difficulty in applying standard residual-based model checking techniques to this type of response variable. A Poisson regression example is discussed in § 4. However, the underlying ideas have wider applications.
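The simulation-band idea described above can be sketched for binary data with a single covariate. This is only an illustrative sketch under stated assumptions, not the construction used in the paper: the Gaussian-kernel Nadaraya-Watson smoother, the bandwidth `h`, the pointwise quantile bands, the choice not to refit the model for each simulated sample, and all function names (`fit_logistic`, `kernel_smooth`, `simulation_bands`) are hypothetical choices made here for illustration.

```python
import numpy as np


def fit_logistic(x, y, n_iter=25):
    """Fit logit P(y = 1) = b0 + b1 * x by Newton-Raphson (assumed fitting method)."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        # Newton step: solve (X' W X) delta = X' (y - p)
        beta = beta + np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - p))
    return beta


def kernel_smooth(x, y, grid, h):
    """Nadaraya-Watson estimate of E[y | x] on `grid` (Gaussian kernel, assumed)."""
    k = np.exp(-0.5 * ((grid[:, None] - x[None, :]) / h) ** 2)
    return (k @ y) / k.sum(axis=1)


def simulation_bands(x, y, grid, h, n_sim=200, level=0.95, seed=0):
    """Return (nonparametric curve, lower band, upper band) on `grid`.

    Bands are pointwise quantiles of smooths of data simulated from the
    fitted logistic model, so they show where the smooth should lie if
    the parametric model were correct.
    """
    rng = np.random.default_rng(seed)
    b0, b1 = fit_logistic(x, y)
    p_hat = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))
    sims = np.empty((n_sim, grid.size))
    for s in range(n_sim):
        y_sim = rng.binomial(1, p_hat)  # simulate responses from the fitted model
        sims[s] = kernel_smooth(x, y_sim, grid, h)
    alpha = 1.0 - level
    lower = np.quantile(sims, alpha / 2.0, axis=0)
    upper = np.quantile(sims, 1.0 - alpha / 2.0, axis=0)
    return kernel_smooth(x, y, grid, h), lower, upper
```

In use, one would plot the nonparametric curve together with the two bands. Regions where the curve leaves the bands suggest where, and in which direction, the parametric model may be inadequate. Because the bands are pointwise, they are a graphical guide rather than a formal global test.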
Autoregressive time series of order 1 are discussed in § 6. Sections 5 and 7 discuss general issues.

We first discuss the context of binary regression with a single covariate and the difficulties caused by the discreteness of the response variable. The observed data are assumed to be of the form (x_i, y_i, n_i), where x_i is a covariate value, and y_i has a binomial
