Water Resources Research
RESEARCH ARTICLE
10.1002/2014WR016395

Key Points:
- Joint Bayesian inference of Gaussian conductivity fields and their variograms
- A dimensionality reduction that systematically honors the underlying variogram
- Distributed multiprocessor implementation is straightforward

Correspondence to: E. Laloy, elaloy@sckcen.be

Citation: Laloy, E., N. Linde, D. Jacques, and J. A. Vrugt (2015), Probabilistic inference of multi-Gaussian fields from indirect hydrological data using circulant embedding and dimensionality reduction, Water Resour. Res., 51, 4224–4243, doi:10.1002/2014WR016395.

Received 10 SEP 2014
Accepted 14 MAY 2015
Accepted article online 19 MAY 2015
Published online 12 JUN 2015

Probabilistic inference of multi-Gaussian fields from indirect hydrological data using circulant embedding and dimensionality reduction

Eric Laloy 1,2, Niklas Linde 3, Diederik Jacques 1, and Jasper A. Vrugt 2,4

1 Institute for Environment, Health and Safety, Belgian Nuclear Research Centre, Mol, Belgium
2 Department of Civil and Environmental Engineering, University of California, Irvine, California, USA
3 Applied and Environmental Geophysics Group, Institute of Earth Sciences, University of Lausanne, Lausanne, Switzerland
4 Department of Earth Systems Science, University of California, Irvine, California, USA

Abstract

We present a Bayesian inversion method for the joint inference of high-dimensional multi-Gaussian hydraulic conductivity fields and associated geostatistical parameters from indirect hydrological data. We combine Gaussian process generation via circulant embedding, which decouples the variogram from grid cell specific values, with dimensionality reduction by interpolation to enable Markov chain Monte Carlo (MCMC) simulation.
Using the Matérn variogram model, this formulation allows inferring the conductivity values simultaneously with the field smoothness (also called the Matérn shape parameter) and other geostatistical parameters such as the mean, sill, integral scales, and anisotropy direction(s) and ratio(s). The proposed dimensionality reduction method systematically honors the underlying variogram and is demonstrated to achieve better performance than the Karhunen-Loève expansion. We illustrate our inversion approach using synthetic (error corrupted) data from a tracer experiment in a fairly heterogeneous 10,000-dimensional 2-D conductivity field. A 40-fold reduction of the size of the parameter space did not prevent the posterior simulations from fitting the measurement data appropriately, nor the posterior parameter distributions from including the true geostatistical parameter values. Overall, the posterior field realizations covered a wide range of geostatistical models, which calls into question the common practice of assuming a fixed variogram prior to inference of the hydraulic conductivity values. Our method is shown to be more efficient than sequential Gibbs sampling (SGS) for the considered case study, particularly when implemented on a distributed computing cluster. It is also found to outperform the method of anchored distributions (MAD) for the same computational budget.

1. Introduction

High parameter dimensionality poses considerable challenges for the inversion of groundwater flow and transport data [e.g., Kitanidis, 1995; Hendricks-Franssen et al., 2009; Laloy et al., 2013; Zhou et al., 2014, and references therein]. What is more, conceptual and structural inadequacies of the subsurface model and measurement errors of the model input (boundary conditions) and output (calibration) data introduce uncertainty in the estimated parameters and model simulations.
Another important source of uncertainty is sparse data coverage, which rarely contains sufficient information to uniquely characterize the subsurface at the spatial resolution deemed necessary for accurate modeling. This results in an ill-posed inverse problem with many different sets of model parameter values that fit the data acceptably well. Inversion methods should consider this inherent uncertainty and provide an ensemble of model realizations that accurately spans the range of possible models that honor the available calibration data and prior information.

© 2015. American Geophysical Union. All Rights Reserved.

Hydraulic conductivity (K) fields are typically assumed to be stationary and log-normally distributed with a spatial structure determined by a two-point geostatistical model or variogram [e.g., Rubin, 2003]. Unfortunately, a lack of (sufficient) point K measurements (if any) makes it difficult to estimate the geostatistical parameters (mean, sill, variogram model, integral scales and anisotropy factors) directly from variographic analysis [Ortiz and Deutsch, 2002; Nowak et al., 2010]. Simultaneous (inverse) inference of conductivity values and associated geostatistical parameters is therefore attractive yet computationally challenging. Indeed, only a few studies in the literature have attempted simultaneous estimation using global and probabilistic search methods. For example, Jafarpour and Tarrahi [2011] used the Ensemble Kalman