Abstract

Forecasters routinely calibrate their confidence in model forecasts. Ensembles inherently estimate forecast confidence, but are often underdispersive, and ensemble spread does not strongly correlate with ensemble-mean error. This misalignment between ensemble spread and skill motivates new methods for "forecasting forecast skill" so that forecasters can better utilize ensemble guidance. We have trained logistic regression and random forest models to predict the skill of composite reflectivity forecasts from the NSSL Warn-on-Forecast System (WoFS), a 3-km ensemble that generates rapidly updating forecast guidance for 0-6-h lead times. The forecast skill predictions are valid at 1-h, 2-h, or 3-h lead times within localized regions determined by the observed storm locations at analysis time. We use WoFS analysis and forecast output and NSSL Multi-Radar/Multi-Sensor composite reflectivity for 106 cases from the 2017-2021 NOAA Hazardous Weather Testbed Spring Forecasting Experiments. We frame the prediction task as a multiclass classification problem, where the forecast skill labels are determined by averaging the extended Fractions Skill Scores (eFSS) over several reflectivity thresholds and verification neighborhoods, then converting to one of three classes based on where the average eFSS ranks within the entire dataset: POOR (bottom 20%), FAIR (middle 60%), or GOOD (top 20%). Initial machine learning (ML) models are trained on 323 predictors; reducing to 10 or 15 predictors in the final models only modestly reduces skill. The final models substantially outperform carefully developed persistence- and spread-based models, and are reasonably explainable. The results suggest that ML can be a valuable tool for guiding user confidence in convection-allowing (and larger-scale) ensemble forecasts.
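The percentile-based labeling scheme described above can be illustrated with a minimal sketch. The function name and toy data below are hypothetical, and the eFSS computation itself is not reproduced; the sketch only shows the conversion of per-case average eFSS values into POOR/FAIR/GOOD classes using the bottom-20% / middle-60% / top-20% dataset rankings stated in the abstract.

```python
import numpy as np

def label_forecast_skill(avg_efss: np.ndarray) -> np.ndarray:
    """Convert per-case average eFSS values to skill classes.

    avg_efss: 1-D array of average eFSS, one value per forecast case.
    Returns an array of class labels based on percentile rank
    within the dataset (hypothetical helper, not the paper's code).
    """
    # Class boundaries at the 20th and 80th percentiles of the dataset
    lo, hi = np.percentile(avg_efss, [20, 80])
    labels = np.full(avg_efss.shape, "FAIR", dtype=object)
    labels[avg_efss <= lo] = "POOR"  # bottom 20% of cases
    labels[avg_efss >= hi] = "GOOD"  # top 20% of cases
    return labels

# Example with toy average eFSS values
scores = np.array([0.15, 0.42, 0.55, 0.61, 0.78, 0.30, 0.90, 0.47, 0.66, 0.22])
print(label_forecast_skill(scores))
```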
