Abstract

Model diagnostics and forecast evaluation are closely related tasks, with the former concerning in-sample goodness (or lack) of fit and the latter addressing predictive performance out-of-sample. We review the ubiquitous setting in which forecasts are cast in the form of quantiles or quantile-bounded prediction intervals. We distinguish unconditional calibration, which corresponds to classical coverage criteria, from the stronger notion of conditional calibration, as can be visualized in quantile reliability diagrams. Consistent scoring functions—including, but not limited to, the widely used asymmetricpiecewise linear score or pinball loss—provide for comparative assessment and ranking, and link to the coefficient of determination and skill scores. We illustrate the use of these tools on Engel's food expenditure data, the Global Energy Forecasting Competition 2014, and the US COVID-19 Forecast Hub.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call