Advancing radar-based precipitation nowcasting : an open benchmark and the potential of deep learning

Georgy Ayzel

doi:10.25932/publishup-50426

Abstract

Precipitation forecasting has an important place in everyday life – during the day we may have tens of small talks discussing the likelihood that it will rain this evening or weekend. Should you take an umbrella for a walk? Or should you invite your friends for a barbecue? It will certainly depend on what your weather application shows. While for years people were guided by the precipitation forecasts issued for a particular region or city several times a day, the widespread availability of weather radars allowed us to obtain forecasts at much higher spatiotemporal resolution of minutes in time and hundreds of meters in space. Hence, radar-based precipitation nowcasting, that is, very-short-range forecasting (typically up to 1–3 h), has become an essential technique, also in various professional application contexts, e.g., early warning, sewage control, or agriculture. There are two major components comprising a system for precipitation nowcasting: radar-based precipitation estimates, and models to extrapolate that precipitation to the imminent future. While acknowledging the fundamental importance of radar-based precipitation retrieval for precipitation nowcasts, this thesis focuses only on the model development: the establishment of open and competitive benchmark models, the investigation of the potential of deep learning, and the development of procedures for nowcast errors diagnosis and isolation that can guide model development. The present landscape of computational models for precipitation nowcasting still struggles with the availability of open software implementations that could serve as benchmarks for measuring progress. Focusing on this gap, we have developed and extensively benchmarked a stack of models based on different optical flow algorithms for the tracking step and a set of parsimonious extrapolation procedures based on image warping and advection. We demonstrate that these models provide skillful predictions comparable with or even superior to state-of-the-art operational software. We distribute the corresponding set of models as a software library, rainymotion, which is written in the Python programming language and openly available at GitHub (https://github.com/hydrogo/rainymotion). That way, the library acts as a tool for providing fast, open, and transparent solutions that could serve as a benchmark for further model development and hypothesis testing. One of the promising directions for model development is to challenge the potential of deep learning – a subfield of machine learning that refers to artificial neural networks with deep architectures, which may consist of many computational layers. Deep learning showed promising results in many fields of computer science, such as image and speech recognition, or natural language processing, where it started to dramatically outperform reference methods. The high benefit of using big for training is among the main reasons for that. Hence, the emerging interest in deep learning in atmospheric sciences is also caused and concerted with the increasing availability of data – both observational and model-based. The large archives of weather radar data provide a solid basis for investigation of deep learning potential in precipitation nowcasting: one year of national 5-min composites for Germany comprises around 85 billion data points. To this aim, we present RainNet, a deep convolutional neural network for radar-based precipitation nowcasting. RainNet was trained to predict continuous precipitation intensities at a lead time of 5 min, using several years of quality-controlled weather radar composites provided by the German Weather Service (DWD). That data set covers Germany with a spatial domain of 900 km x 900 km and has a resolution of 1 km in space and 5 min in time. Independent verification experiments were carried out on 11 summer precipitation events from 2016 to 2017. In these experiments, RainNet was applied recursively in order to achieve lead times of up to 1 h. In the verification experiments, trivial Eulerian persistence and a conventional model based on optical flow served as benchmarks. The latter is available in the previously developed rainymotion library. RainNet significantly outperformed the benchmark models at all lead times up to 60 min for the routine verification metrics mean absolute error (MAE) and critical success index (CSI) at intensity thresholds of 0.125, 1, and 5 mm/h. However, rainymotion turned out to be superior in predicting the exceedance of higher intensity thresholds (here 10 and 15 mm/h). The limited ability of RainNet to predict high rainfall intensities is an undesirable property which we attribute to a high level of spatial smoothing introduced by the model. At a lead time of 5 min, an analysis of power spectral density confirmed a significant loss of spectral power at length scales of 16 km and below. Obviously, RainNet had learned an optimal level of smoothing to produce a nowcast at 5 min lead time. In that sense, the loss of spectral power at small scales is informative, too, as it reflects the limits of predictability as a function of spatial scale. Beyond the lead time of 5 min, however, the increasing level of smoothing is a mere artifact – an analogue to numerical diffusion – that is not a property of RainNet itself but of its recursive application. In the context of early warning, the smoothing is particularly unfavorable since pronounced features of intense precipitation tend to get lost over longer lead times. Hence, we propose several options to address this issue in prospective research on model development for precipitation nowcasting, including an adjustment of the loss function for model training, model training for longer lead times, and the prediction of threshold exceedance. The model development together with the verification experiments for both conventional and deep learning model predictions also revealed the need to better understand the source of forecast errors. Understanding the dominant sources of error in specific situations should help in guiding further model improvement. The total error of a precipitation nowcast consists of an error in the predicted location of a precipitation feature and an error in the change of precipitation intensity over lead time. So far, verification measures did not allow to isolate the location error, making it difficult to specifically improve nowcast models with regard to location prediction. To fill this gap, we introduced a framework to directly quantify the location error. To that end, we detect and track scale-invariant precipitation features (corners) in radar images. We then consider these observed tracks as the true reference in order to evaluate the performance (or, inversely, the error) of any model that aims to predict the future location of a precipitation feature. Hence, the location error of a forecast at any lead time ahead of the forecast time corresponds to the Euclidean distance between the observed and the predicted feature location at the corresponding lead time. Based on this framework, we carried out a benchmarking case study using one year worth of weather radar composites of the DWD. We evaluated the performance of four extrapolation models, two of which are based on the linear extrapolation of corner motion; and the remaining two are based on the Dense Inverse Search (DIS) method: motion vectors obtained from DIS are used to predict feature locations by linear and Semi-Lagrangian extrapolation. For all competing models, the mean location error exceeds a distance of 5 km after 60 min, and 10 km after 110 min. At least 25% of all forecasts exceed an error of 5 km after 50 min, and of 10 km after 90 min. Even for the best models in our experiment, at least 5 percent of the forecasts will have a location error of more than 10 km after 45 min. When we relate such errors to application scenarios that are typically suggested for precipitation nowcasting, e.g., early warning, it becomes obvious that location errors matter: the order of magnitude of these errors is about the same as the typical extent of a convective cell. Hence, the uncertainty of precipitation nowcasts at such length scales – just as a result of locational errors – can be substantial already at lead times of less than 1 h. Being able to quantify the location error should hence guide any model development that is targeted towards its minimization. To that aim, we also consider the high potential of using deep learning architectures specific to the assimilation of sequential (track) data. Last but not least, the thesis demonstrates the benefits of a general movement towards open science for model development in the field of precipitation nowcasting. All the presented models and frameworks are distributed as open repositories, thus enhancing transparency and reproducibility of the methodological approach. Furthermore, they are readily available to be used for further research studies, as well as for practical applications.

Full Text