Abstract

There are distinctive methodological and conceptual challenges in rare and severe event (RSE) forecast-verification, that is, in the assessment of the quality of forecasts involving natural hazards such as avalanches or tornadoes. While some of these challenges have been discussed since the inception of the discipline in the 1880s, there is no consensus about how to assess RSE forecasts. This article offers a comprehensive and critical overview of the many different measures used to capture the quality of an RSE forecast and argues that there is only one proper skill score for RSE forecast-verification. We first focus on the relationship between accuracy and skill and show why skill is more important than accuracy in the case of RSE forecast-verification. Subsequently, we motivate three adequacy constraints for a proper measure of skill in RSE forecasting. We argue that the Peirce Skill Score is the only score that meets all three adequacy constraints. We then show how our theoretical investigation has important practical implications for avalanche forecasting by discussing a recent study in avalanche forecast-verification using the nearest neighbour method. We also raise what we call the “scope challenge”, which affects all forms of RSE forecasting, and highlight how and why the proper skill measure is important not only for local binary RSE forecasts but also for the assessment of different diagnostic tests widely used in avalanche risk management and related operations. Finally, our discussion is also relevant to the thriving research project of designing methods to assess the quality of regional multi-categorical avalanche forecasts.
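The contrast between accuracy and skill can be illustrated with a small numerical sketch. The Python below is not taken from the article. It assumes the standard 2x2 contingency-table definitions: accuracy as the proportion of correct forecasts, the Peirce Skill Score as hit rate minus false alarm rate, and the Heidke Skill Score in its usual closed form. The example counts are the commonly cited figures for Finley's 1884 tornado forecasts, used here only as an illustration. All class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContingencyTable:
    """2x2 verification table for a binary (yes/no) event forecast."""
    hits: int               # event forecast, event observed
    false_alarms: int       # event forecast, no event observed
    misses: int             # no event forecast, event observed
    correct_negatives: int  # no event forecast, no event observed

    @property
    def total(self) -> int:
        return (self.hits + self.false_alarms
                + self.misses + self.correct_negatives)


def accuracy(t: ContingencyTable) -> float:
    """Proportion of all forecasts that were correct."""
    return (t.hits + t.correct_negatives) / t.total


def peirce_skill_score(t: ContingencyTable) -> float:
    """Hit rate minus false alarm rate (assumes both event classes occur)."""
    hit_rate = t.hits / (t.hits + t.misses)
    false_alarm_rate = t.false_alarms / (t.false_alarms + t.correct_negatives)
    return hit_rate - false_alarm_rate


def heidke_skill_score(t: ContingencyTable) -> float:
    """Accuracy corrected for the number of forecasts correct by chance."""
    a, b, c, d = t.hits, t.false_alarms, t.misses, t.correct_negatives
    numerator = 2 * (a * d - b * c)
    denominator = (a + c) * (c + d) + (a + b) * (b + d)
    return numerator / denominator


# Illustrative counts: the commonly cited Finley (1884) tornado table.
finley = ContingencyTable(hits=28, false_alarms=72,
                          misses=23, correct_negatives=2680)

# A forecaster who never predicts the event, scored on the same observations.
never = ContingencyTable(hits=0, false_alarms=0,
                         misses=51, correct_negatives=2752)

for name, table in [("finley", finley), ("never", never)]:
    print(name,
          f"accuracy={accuracy(table):.3f}",
          f"PSS={peirce_skill_score(table):.3f}",
          f"HSS={heidke_skill_score(table):.3f}")
```

Under these assumptions the "never forecast" strategy scores a higher accuracy (about 0.982 against 0.966) while both skill scores drop to zero, which is the pattern the accuracy-versus-skill discussion turns on.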

Highlights

  • In this paper, we draw on insights from the rich history of tornado forecast-verification to locate important theoretical debates that arise within the context of binary rare and severe event (RSE) forecast-verification

  • We raise what we call the “scope challenge” that affects all forms of RSE forecasting and highlight how and why the proper skill measure is important for local binary RSE forecasts and for the assessment of different diagnostic tests widely used in avalanche risk management and related operations

  • Although a number of different measures have been used to assess the quality (the goodness of fit) of individual RSE forecasts and to justify comparative judgements about different RSE forecasts, there has been no consensus about which measure is the most relevant in the context of binary RSE forecasts


Summary

Introduction

We draw on insights from the rich history of tornado forecast-verification to locate important theoretical debates that arise within the context of binary rare and severe event (RSE) forecast-verification. This article offers a comprehensive and critical overview of the different measures used to assess the quality of an RSE forecast and argues that there is only one proper skill score for binary RSE forecast-verification. We highlight a wider conceptual challenge for binary RSE forecast-verification by considering what we call the “scope-problem”. We apply this problem to the special case of avalanche forecasting and conclude by highlighting how our results are relevant to different aspects of avalanche operations and management.

Accuracy Paradox: setting the stage
The accuracy paradox: accuracy vs skill
First adequacy constraint
Third adequacy constraint
Application: the relevance of skill scores in avalanche forecast-verification
Accuracy measure: its shortcomings exemplified
The Peirce Skill Score and NN avalanche forecasting
The Heidke Skill Score and NN forecasting
The (ir)relevance of the Success Rate for NN forecasting
Findings
Conclusions
