Policy formulation and planning in many disciplines depend on mathematical modelling of empirical data. Linear regressions are common. However, data may vary substantially from one situation to another. This paper does not focus on statistical fluctuation. Rather, data at one time (e.g. prior to the 2019 Covid outbreak) may differ in important ways from data at a later time (e.g. post-Covid). Likewise, data from one location may differ substantively from data elsewhere with a different population, culture, milieu, etc. This challenge introduces data-uncertainty for which probabilistic models are unavailable or require assumptions that may be unjustified. This paper uses non-probabilistic info-gap models of uncertainty to represent data-uncertainty, and the info-gap concept of robustness to uncertainty as the basis for choosing between alternative realizations of a linear regression of empirical data. We demonstrate three properties of info-gap robustness functions when predicting an outcome variable of interest: zeroing, trade-off, and preference reversal. We also demonstrate the potential utility of excluding selected dependent variables. This analysis supports the use of linear regressions for policy analysis, planning, and decision making. We illustrate the analysis with an epidemiological example.