Abstract

The implications of definitions of excessive rainfall observations on machine learning model forecast skill are assessed using the Colorado State University Machine Learning Probabilities (CSU-MLP) forecast system. The CSU-MLP uses historical observations along with reforecasts from a global ensemble to train random forests to probabilistically predict excessive rainfall events. Here, random forest models are trained using two distinct rainfall datasets: one composed of fixed-frequency (FF) average recurrence interval exceedances and flash flood reports, and the other a compilation of flooding and rainfall proxies [Unified Flood Verification System (UFVS)]. Both models generate 1–3-day forecasts and are evaluated against a climatological baseline to characterize their overall skill as a function of lead time, season, and region. Model comparisons suggest that regional frequencies of excessive rainfall observations influence when and where the ML models issue forecasts and, in turn, their skill and reliability. Additionally, the spatiotemporal distribution of observations has implications for ML model training requirements, notably how long an observational record is needed to obtain skillful forecasts. Experiments reveal that UFVS-based models trained on shorter records can be as skillful as FF-based models trained on longer ones. In essence, the UFVS dataset exhibits a more robust characterization of excessive rainfall and its impacts, and machine learning models trained on more representative datasets of meteorological hazards may not require as extensive training to generate skillful forecasts.

Significance Statement

Machine learning (ML) models have shown significant promise in recent years when used to predict high-impact weather hazards. Here, we explore two similarly trained ML models tasked with predicting excessive rainfall but trained on datasets that define excessive rainfall differently.
We explore how a definition of excessive rainfall (for example, a rainfall amount expected to occur once per year at a given location) contributes to forecast skill through where the corresponding observations are reported. Generally, we find that the two models have substantial skill relative to climatology out to 3 days, but skill varies by geographical region and season, in part because of the geographic distribution of observations. These results suggest that careful attention should be paid to how ML models are trained to predict meteorological hazards like excessive rainfall.
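The evaluation approach described above, a probabilistic random forest forecast scored against a climatological baseline, can be sketched as follows. This is a minimal illustration on synthetic data, not the CSU-MLP implementation: the predictors are random stand-ins for reforecast fields, and the Brier skill score is used here as one common choice of skill metric relative to climatology.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-ins for reforecast predictors and binary
# excessive-rainfall labels (hypothetical data, for illustration only).
n_train, n_test, n_feat = 2000, 500, 6
X_train = rng.normal(size=(n_train, n_feat))
y_train = X_train[:, 0] + 0.5 * X_train[:, 1] + rng.normal(size=n_train) > 1.5
X_test = rng.normal(size=(n_test, n_feat))
y_test = X_test[:, 0] + 0.5 * X_test[:, 1] + rng.normal(size=n_test) > 1.5

# Random forest issuing probabilistic forecasts of the event class.
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X_train, y_train)
p_fcst = rf.predict_proba(X_test)[:, 1]

# Climatological baseline: a constant forecast of the training-period
# event frequency.
p_clim = np.full(n_test, y_train.mean())

# Brier skill score; values > 0 indicate skill relative to climatology.
bs_fcst = np.mean((p_fcst - y_test) ** 2)
bs_clim = np.mean((p_clim - y_test) ** 2)
bss = 1.0 - bs_fcst / bs_clim
print(f"BSS = {bss:.2f}")
```

In this framing, regional differences in how often events are observed would enter through `y_train`, which is one way the observation dataset's definition of excessive rainfall can shape where and when the trained model assigns high probabilities.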