Abstract

Assessment and Propagation of Model Uncertainty

By DAVID DRAPER*
University of Bath, UK

SUMMARY

In most examples of inference and prediction, the expression of uncertainty about unknown quantities y on the basis of known quantities x is based on a model M that formalizes assumptions about how x and y are related. M will typically have two parts: structural assumptions S, such as the form of the link function and the choice of error distribution in a generalized linear model, and parameters θ whose meaning is specific to a given choice of S. It is common in statistical theory and practice to acknowledge parametric uncertainty about θ given a particular assumed structure S; it is less common to acknowledge structural uncertainty about S itself. A widely used approach, in fact, involves enlisting the aid of x to specify a plausible single best choice S* for S, and then proceeding as if S* were known to be correct. In general this approach fails to fully assess and propagate structural uncertainty, and may lead to miscalibrated uncertainty assessments about y given x. When miscalibration occurs it will often be in the direction of understatement of inferential or predictive uncertainty about y, leading to inaccurate scientific summaries and overconfident decisions that do not incorporate sufficient hedging against uncertainty. In this paper I discuss a Bayesian approach to solving this problem that has long been available in principle but is only now becoming routinely feasible, by virtue of recent computational advances, and examine its implementation in examples that involve forecasting the price of oil and estimating the chance of catastrophic failure of the U.S. Space Shuttle.

Keywords: BAYES FACTORS; CALIBRATION; FORECASTING; HIERARCHICAL MODELS; INFERENCE; MODEL SPECIFICATION; OVER-FITTING; PREDICTION; ROBUSTNESS; SENSITIVITY ANALYSIS; UNCERTAINTY ASSESSMENT

1. INTRODUCTION

The general framework of problems in inference and prediction involves two sets of ingredients: unknown(s) y (such as the causal effect of a treatment in inference, or the price of something next year in prediction) and known(s) x, which will typically include both data and context. The desire is usually to express uncertainty about y in light of x, for instance through a probability specification of the form p(y | x). Specifications of this type that involve conditioning only on things that are known are rare, even in comparatively simple settings (e.g., Lindley, 1982); instead one typically appeals to a model M that formalizes judgments about how x and y are related.

1.1. Structural Uncertainty

The model may be expressed (e.g., Draper et al., 1987; Hodges, 1987) in two parts as M = (S, θ), where S represents one or more sets of structural assumptions, such as a particular link function in a generalized linear model or a particular form of heteroscedasticity or time dependence with non-IID data, and θ represents parameters whose meaning is specific to the chosen structure(s). (It will often be possible to express a given model M in more than one way using this notation, but that does not affect the discussion that follows.) Once S is chosen, θ typically follows

* Address for correspondence: Statistics Group, School of Mathematical Sciences, University of Bath, Claverton Down, Bath BA2 7AY, UK (d.draper@maths.bath.ac.uk).
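
The Summary's claim that proceeding as if a single best structure S* were correct tends to understate uncertainty about y can be illustrated numerically. The sketch below is not taken from the paper. It assumes a finite set of candidate structures S_i and the standard model-averaging identity p(y | x) = sum_i p(y | x, S_i) p(S_i | x), and it compares the predictive variance under the most probable structure with the variance of the averaged prediction. The class name, function name, and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class StructurePrediction:
    """Predictive summary for y under one candidate structure S_i (hypothetical values)."""
    name: str
    posterior_prob: float  # p(S_i | x), assumed already computed
    mean: float            # E(y | x, S_i)
    variance: float        # Var(y | x, S_i)


def averaged_mean_and_variance(
    preds: Sequence[StructurePrediction],
) -> Tuple[float, float]:
    """Mean and variance of the mixture sum_i p(S_i | x) p(y | x, S_i)."""
    total = sum(p.posterior_prob for p in preds)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("posterior structure probabilities must sum to 1")
    mean = sum(p.posterior_prob * p.mean for p in preds)
    # Law of total variance: average within-structure variance
    # plus the variance of the structure-specific means.
    within = sum(p.posterior_prob * p.variance for p in preds)
    between = sum(p.posterior_prob * (p.mean - mean) ** 2 for p in preds)
    return mean, within + between


# Made-up inputs: two structures that disagree about y.
candidates = [
    StructurePrediction("S1", posterior_prob=0.6, mean=20.0, variance=4.0),
    StructurePrediction("S2", posterior_prob=0.4, mean=30.0, variance=9.0),
]

# "Single best structure" analysis: keep only the most probable S_i.
best = max(candidates, key=lambda p: p.posterior_prob)

# Analysis that propagates structural uncertainty.
mix_mean, mix_var = averaged_mean_and_variance(candidates)

print(f"S* = {best.name}: mean {best.mean:.1f}, variance {best.variance:.1f}")
print(f"averaged over S: mean {mix_mean:.1f}, variance {mix_var:.1f}")
```

With these made-up inputs the averaged variance is about 30, against 4 under the single best structure. Most of the difference is the between-structure term, which is exactly the component that is dropped when S* is treated as known.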
