Abstract Probabilistic forecasts derived from ensemble prediction systems (EPSs) have become the standard basis for many products and services produced by modern operational forecasting centers. However, statistical postprocessing is generally required to ensure forecasts have the desired properties expected for probability-based outputs. Precipitation, a core component of any operational forecast, is particularly challenging to calibrate due to its discontinuous nature and the extreme skew in rainfall amounts. A skillful forecasting system must maintain accuracy for low-to-moderate precipitation amounts, but preserve resolvability for high-to-extreme rainfall amounts, which, though rare, are important to forecast accurately in the interest of public safety. Existing statistical and machine learning approaches to rainfall calibration address this problem, but each has drawbacks in design, training approaches, and/or performance. We describe RainForests, a machine learning approach for calibrating ensemble rainfall forecasts using gradient-boosted decision trees. The model is based on the ecPoint system recently developed at ECMWF by Hewson and Pillosu, but uses machine learning models in place of the semisubjective decision trees of ecPoint, along with some other improvements to the model structure. We evaluate RainForests on the Australian domain against some simple benchmarks and show that it outperforms standard calibration approaches both in overall skill and in accurately forecasting high rainfall conditions, while being computationally efficient enough to be used in an operational forecasting system. Significance Statement Weather forecasting typically uses physical models to simulate the atmosphere and produce a set of scenarios, called an ensemble, for each forecast time. Forecast calibration is a set of statistical techniques for using these ensemble forecasts to obtain probability forecasts of weather variables. For example, on a given day, we may predict that there are a 90% chance of at least 1 mm of rain and a 50% chance of at least 3 mm. We introduce RainForests, a new machine learning method for calibrating ensemble forecasts, which uses ensemble forecasts of rainfall and other related variables to produce more accurate probability forecasts of rainfall amounts in operational forecasting settings.
Read full abstract