Abstract

Ron Hocking's own work on regression has played such an important role in the development of regression methods that it is very fitting for him to have written this review of the last quarter-century of advances. His review paper on selection methods in Biometrics (Hocking 1976) got me interested in that problem and possibly in regression in general, and that paper, like the current one, is exemplary. Both survey an area, but still leave many questions unanswered.

In fitting linear regression models we make many assumptions, such as linearity, constant variance, and perhaps normality. We assume that the relevant variables are measured, and that these do not need to be transformed. As Hocking has pointed out, the pre-computer approach of taking the assumptions as given and correct is no longer accepted statistical practice. Methods for criticizing assumptions and for influence analysis have now become standard, as clearly indicated by the proportion of Hocking's review that is dedicated to such methods. Hocking does note, however, that the array of such techniques available to the analyst is large, so the choice of appropriate and useful measures is not always clear.

The confusion has several sources. The whole methodology of regression criticism has developed very quickly. Before the last decade, the most common tools for criticism were plots of residuals against various quantities such as fitted values, and probability plots. Each of these was intended to serve a number of purposes, providing information on outliers, linearity, heteroscedasticity, the need to transform, and perhaps some notion of influence, depending on the pattern in the plot. On the other hand, the recently developed or rediscovered methods for criticism seem to address specific issues, and each of these methods may require computation of statistics useful for that one method only. At the same time, methods have been developed that are probably not generally useful, but many regression analysts are not sufficiently knowledgeable to tell the good ones from the not-so-good ones.

My purpose in these remarks is to present some guidelines for developing, and using, methods for regression criticism. I address separately what I call diagnostics (model criticism) and influence analysis (data criticism). Before proceeding, it is worth pointing out that not everyone agrees on the importance of these methods. Some think that the methods do little more than allow analysts to make quick but superficial decisions concerning data. I obviously do not agree with this view and have found that intelligent application of these methods can be very useful in practice. In any case, the discussion following Atkinson (1982) is illuminating on this issue.
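
The distinction between diagnostics (model criticism) and influence analysis (data criticism) can be made concrete with a small numerical sketch. The example below is only an illustration under stated assumptions: the data are made up, the function name `regression_criticism` is hypothetical, and leverage and Cook's distance are standard measures chosen here for illustration; the abstract itself does not name them. Residuals and fitted values are the raw material for the traditional plots, while leverage and Cook's distance speak to the influence of individual cases.

```python
import numpy as np


def regression_criticism(x, y):
    """Fit y = b0 + b1*x by least squares and return basic criticism statistics.

    Hypothetical helper for illustration only; not taken from the source.
    """
    X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    resid = y - fitted  # what one would plot against the fitted values

    # Leverage: diagonal of the hat matrix H = X (X'X)^{-1} X'.
    xtx_inv = np.linalg.inv(X.T @ X)
    leverage = np.einsum("ij,jk,ik->i", X, xtx_inv, X)

    # Cook's distance: one common single-case influence measure.
    s2 = resid @ resid / (n - p)  # residual mean square
    cooks = resid**2 / (p * s2) * leverage / (1.0 - leverage) ** 2

    return {
        "fitted": fitted,
        "residuals": resid,
        "leverage": leverage,
        "cooks_distance": cooks,
    }


# Made-up data: ten cases close to the line y = 2 + 3x, plus one case
# far out in x whose response lies well below that line.
x = np.append(np.arange(10.0), 30.0)
y = 2.0 + 3.0 * x + 0.1 * (-1.0) ** np.arange(11)
y[-1] -= 40.0

stats = regression_criticism(x, y)
print("case with largest leverage:", int(np.argmax(stats["leverage"])))
print("case with largest Cook's distance:", int(np.argmax(stats["cooks_distance"])))
```

In this constructed data set the remote case pulls the fitted line toward itself, so a plot of raw residuals alone need not single it out, whereas a case-deletion style measure does. That is one reason a separate treatment of model criticism and data criticism can be useful.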
