Abstract

The risk of drought affecting drinking water and agricultural production is a growing concern in developed countries, especially in the context of a changing climate. To manage and anticipate this phenomenon, real-time monitoring and predictive systems are emerging as key solutions. In the field of artificial intelligence, neural networks are one such predictive system. This family of parameterized models is a composition of neuron functions, each of which applies a non-linear transformation to its inputs to produce its output. These networks can learn the behaviour of a hydro(geo)logical system from a database of observed inputs (rainfall, evapotranspiration, etc.) and outputs (groundwater level, discharge, etc.), using an algorithm that minimizes a cost function between observed and simulated outputs. However, it remains difficult to assess the uncertainty of these models, which may lead end users to misinterpret the results. These uncertainties are mainly of three types. The first is related to the input data: hydrosystems are spatially extended (areal) elements, whereas meteorological inputs are point measurements. The interpolation error can therefore be significant because of the lack of knowledge between gauging stations. The second is the neural network architecture itself; this source of uncertainty can be handled with regularization methods. Finally, neural networks are subject to uncertainty related to parameter initialization before the training step: the initial parameters may have a strong impact on the results. In this paper, we address the prediction of the Blavet groundwater level (Bretagne, France). To assess uncertainties, we first focus on the parameter initialization of the model. Neural models are optimized using cross-validation and early stopping. Then an ensemble model is built, in which each member results from a unique parameter initialization.
The purpose of the study is to determine how many initializations are necessary to obtain a reasonable confidence interval for forecasts, that is, the narrowest interval containing the highest proportion of observed points. The best model is determined using cross-validation scores, thereby ensuring optimal robustness. We show that, in this case study, an ensemble of 20 different initializations is sufficient to estimate uncertainty while preserving quality. In the second part, the resulting ensemble model is used to estimate the global model uncertainty using probability density functions (pdf) applied to the distribution of groundwater level data and to the cross-validation scores of the forecasts. This analysis reveals that the groundwater level predictions are composed of two mixed distributions. We therefore use the expectation-maximization (EM) algorithm to obtain the parameters of the mixture models. Among the five mixture distributions assessed, the normal and Gumbel mixtures give the best fit to the groundwater distribution and can be used to generate an abacus (chart) depicting the model uncertainty.
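As an illustration of the EM step mentioned above, the following is a minimal sketch of fitting a two-component normal mixture with NumPy. It is not the authors' implementation: the function name `em_two_gaussians`, the quartile-based initialization, the stopping tolerance and the synthetic "groundwater level" values are all assumptions made for this example, and the Gumbel mixture is not covered.

```python
import numpy as np

def em_two_gaussians(x, n_iter=200, tol=1e-8):
    """Fit a two-component normal mixture to 1-D data with EM (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    # Assumed initialization: component means at the quartiles, shared spread.
    mu = np.array([np.percentile(x, 25), np.percentile(x, 75)])
    sigma = np.array([x.std(), x.std()])
    weight = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: weighted density of each point under each component.
        z = (x[:, None] - mu) / sigma
        dens = weight * np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))
        total = dens.sum(axis=1, keepdims=True)
        resp = dens / total  # responsibilities, each row sums to 1
        # M-step: re-estimate weights, means and standard deviations.
        nk = resp.sum(axis=0)
        weight = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        sigma = np.maximum(np.sqrt(var), 1e-6)  # floor avoids a collapsed component
        # Log-likelihood of the parameters used in this E-step.
        ll = float(np.log(total).sum())
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return weight, mu, sigma

# Synthetic stand-in data (NOT the Blavet record): two overlapping regimes.
rng = np.random.default_rng(1)
levels = np.concatenate([
    rng.normal(10.0, 0.5, 800),   # hypothetical low-level regime
    rng.normal(14.0, 0.8, 1200),  # hypothetical high-level regime
])
weight, mu, sigma = em_two_gaussians(levels)
print("weights:", weight, "means:", mu, "std devs:", sigma)
```

The E-step assigns each value a responsibility under each component, and the M-step re-estimates the weights, means and spreads from those responsibilities. The same loop structure would apply to other component families, with the M-step replaced by the corresponding estimator.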

Highlights

  • Water-related risks impact a large part of the population

  • In a climate change context, with a rise in the frequency and duration of extreme phenomena [1], real-time monitoring and predictive systems are emerging as key solutions

  • Even if the SFPI could be improved by using more members, the cost-benefit ratio favours the 20-member ensemble
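The trade-off between ensemble size, interval width and the share of observations falling inside the interval can be sketched as follows. This is only an illustration under stated assumptions: the percentile envelope, the function name `ensemble_interval` and the synthetic member forecasts are hypothetical and do not reproduce the study's interval definition, its scores or its data.

```python
import numpy as np

def ensemble_interval(member_preds, observed, lower_q=2.5, upper_q=97.5):
    """Percentile envelope of an ensemble, with its mean width and coverage.

    member_preds: array of shape (n_members, n_times), one row per initialization.
    observed:     array of shape (n_times,).
    """
    lower = np.percentile(member_preds, lower_q, axis=0)
    upper = np.percentile(member_preds, upper_q, axis=0)
    width = float(np.mean(upper - lower))
    # Fraction of observed points that fall inside the envelope.
    coverage = float(np.mean((observed >= lower) & (observed <= upper)))
    return lower, upper, width, coverage

# Synthetic stand-in for forecasts from differently initialized networks.
rng = np.random.default_rng(0)
n_times = 200
truth = np.sin(np.linspace(0.0, 6.0, n_times))
observed = truth + rng.normal(0.0, 0.05, n_times)
all_members = truth + rng.normal(0.0, 0.1, size=(50, n_times))

results = {}
for n_members in (5, 10, 20, 50):
    lower, upper, width, coverage = ensemble_interval(all_members[:n_members], observed)
    results[n_members] = (width, coverage)
    print(f"{n_members:2d} members: mean width = {width:.3f}, coverage = {coverage:.2%}")
```

Comparing width and coverage across ensemble sizes in this way is one simple means of judging when additional members stop paying for their training cost.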

Summary

Introduction

Water-related risks impact a large part of the population. On the one hand, floods frequently cause fatalities and damage. The first relies on a deep knowledge of the considered system, dedicated to building physically based models. This knowledge is usually difficult to acquire, which makes these methods less efficient and costly in time and money. The second relies on the relation between system inputs and outputs, without making any strong hypothesis about how the system operates, as long as the inputs are able to explain the outputs (such as rainfall for discharge). In the latter family of solutions, neural networks play an important role, as they are known to be able to identify dynamical processes [2]. These uncertainties mainly have three origins: input data (especially noise and spatial variability), model architecture, and parameter initialization before the training step [3].
