Abstract

Probabilistic prediction models exist to reduce surprise about future events. This paper explores the evaluation of such forecasts when the event of interest is rare. We review how the family of Brier-type scores may be ill-suited to evaluating predictions of rare events, and we offer information-theoretic scores such as Ignorance as an alternative. The reduction in surprise provided by a set of forecasts is represented as information gain, a quantity frequently used as a loss function in machine-learning training: the reduction in ignorance relative to a baseline once a new forecast is received. We evaluate predictions on a synthetic dataset of rare events and demonstrate how the interpretation of the same data differs depending on whether the Brier or Ignorance score is used. While the two types of scores broadly agree, their interpretations diverge substantially at extreme probabilities. Information gain is measured in bits, an irreducible unit of information, which allows forecasts of different variables to be compared fairly. Further insight from information-based scores is gained via a reliability–discrimination decomposition similar to that found in Brier-type scores. We conclude by crystallising multiple concepts to better equip forecast-system developers and decision-makers with tools for navigating the complex trade-offs and uncertainties that characterise meteorological forecasting. To this end, we also provide computer code to reproduce the data and figures herein.
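As an illustration of the quantities the abstract compares, the following sketch (not the paper's actual code or data; the base rate, the toy forecast model, and all variable names are hypothetical) computes the Brier score, the Ignorance score, and the information gain in bits over a climatological baseline for a synthetic rare-event dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic rare-event dataset: the binary event occurs
# with a climatological base rate of about 1%.
n = 10_000
base_rate = 0.01
outcomes = rng.random(n) < base_rate  # boolean array of observed events

# Hypothetical forecast probabilities: slightly informative (sharpened
# toward occurrences) with small noise, clipped away from 0 and 1.
forecasts = np.clip(
    base_rate + 0.05 * outcomes + rng.normal(0.0, 0.002, n),
    1e-6, 1.0 - 1e-6,
)

# Brier score: mean squared difference between forecast probability
# and the binary outcome (lower is better).
brier = np.mean((forecasts - outcomes) ** 2)

# Ignorance score: mean negative log2 probability assigned to the
# outcome that actually occurred (lower is better, units of bits).
prob_of_outcome = np.where(outcomes, forecasts, 1.0 - forecasts)
ign = -np.mean(np.log2(prob_of_outcome))

# Information gain: reduction in ignorance relative to a baseline
# that always issues the climatological base rate.
baseline_prob = np.where(outcomes, base_rate, 1.0 - base_rate)
ign_baseline = -np.mean(np.log2(baseline_prob))
info_gain = ign_baseline - ign

print(f"Brier: {brier:.5f}  Ignorance: {ign:.4f} bits  "
      f"Gain over baseline: {info_gain:.4f} bits")
```

Because the toy forecasts assign higher probability to events than the base rate does, the information gain comes out positive; with a misleading forecast it would be negative, reflecting added rather than reduced surprise.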
