Parallel Approximations of the Tukey g-and-h Likelihoods and Predictions for Non-Gaussian Geostatistics

Sagnik Mondal, David E Keyes, Hatem Ltaief, Sameh Abdulah, Ying Sun, Marc G Genton

doi:10.1109/ipdps53621.2022.00044

Abstract

Maximum likelihood estimation is an essential tool for imputing missing data in climate and weather applications. Given a particular statistical model, maximum likelihood estimation can be used to understand the underlying structure of geospatial data. The Gaussian random field is one of the most popular such models and has been widely used to describe geospatial data. Computing the Gaussian log-likelihood demands operations on a dense symmetric positive definite matrix, often parameterized by the Matérn correlation function. This computation requires $\mathcal{O}(n^{2})$ storage and $\mathcal{O}(n^{3})$ operations, which is a huge task considering that the number of geographical locations, $n$, now commonly reaches into the millions. Moreover, despite its appealing theoretical properties, the assumption of Gaussianity may be unrealistic, since real data often show signs of skewness or contain extreme values. Herein, we consider the Tukey $g$-and-$h$ (TGH) random field as an example of a non-Gaussian random field that models geospatial data more robustly by including two additional parameters that incorporate skewness and heavy tails. This work provides the first HPC implementation of the TGH random field's inference on parallel hardware architectures.
Using task-based programming models associated with dynamic runtime systems, our implementation leverages the high concurrency of current parallel systems. This makes it possible to run the exact log-likelihood evaluation of TGH random fields for a decent number of geospatial locations. To tackle large-scale problems, we additionally provide an implementation of the model using two different low-rank approximations. We compress the aforementioned symmetric positive definite matrix used to compute the log-likelihood, relying on the Tile Low-Rank (TLR) and Hierarchical Off-Diagonal Low-Rank (HODLR) matrix approximations. We assess the performance and accuracy of the proposed implementations using synthetic datasets of up to $800K$ locations and precipitation data from Germany at $300K$ locations to demonstrate the advantage of using non-Gaussian over Gaussian random fields. Moreover, by relying on TLR/HODLR matrix computations, we can now solve for larger matrix sizes while preserving the accuracy required for prediction. We show the performance superiority of TLR over HODLR matrix computations when calculating the TGH likelihoods and predictions. Our TLR-based approximation shows speedups of up to $7.29X$ and $2.96X$ on shared-memory and distributed-memory systems, respectively, compared to the exact implementation.
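To make the cost argument in the abstract concrete, the following is a minimal single-node sketch of an exact Gaussian log-likelihood evaluated through a dense Cholesky factorization, together with the standard Tukey g-and-h transform of Gaussian values. It is not the authors' task-based implementation. The function names, the parameter names (`sigma2`, `beta`, `g`, `h`), and the choice of the exponential kernel (the Matérn family with smoothness 1/2) are assumptions made only for illustration, and the full TGH likelihood, which also needs the inverse transform, is not shown.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.spatial.distance import cdist


def exponential_covariance(locs, sigma2=1.0, beta=0.1):
    """Dense covariance matrix from the exponential kernel (Matern, smoothness 1/2).

    sigma2 (variance) and beta (range) are illustrative parameter names.
    """
    dists = cdist(locs, locs)  # n x n pairwise distances: O(n^2) storage
    return sigma2 * np.exp(-dists / beta)


def gaussian_loglik(z, cov):
    """Exact zero-mean Gaussian log-likelihood via a dense Cholesky factorization."""
    n = z.shape[0]
    factor = cho_factor(cov, lower=True)  # O(n^3) operations
    logdet = 2.0 * np.sum(np.log(np.diag(factor[0])))  # log|cov| from the factor
    quad = z @ cho_solve(factor, z)  # z^T cov^{-1} z
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)


def tukey_gh(z, g=0.5, h=0.1):
    """Tukey g-and-h transform: (exp(g z) - 1) / g * exp(h z^2 / 2), for g != 0, h >= 0."""
    return np.expm1(g * z) / g * np.exp(0.5 * h * z**2)


# Small synthetic example on the unit square (sizes chosen only for illustration).
rng = np.random.default_rng(0)
locs = rng.uniform(size=(200, 2))
cov = exponential_covariance(locs)
z = np.linalg.cholesky(cov) @ rng.standard_normal(200)  # one Gaussian field sample
loglik = gaussian_loglik(z, cov)
t = tukey_gh(z)  # skewed, heavy-tailed version of the same sample
```

Here `g` controls skewness and `h` controls tail heaviness. The dense factorization is the step whose $\mathcal{O}(n^{3})$ cost motivates the low-rank approximations discussed in the abstract.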


Similar Papers

Tile low-rank approximations of non-Gaussian space and space-time Tukey g-and-h random field likelihoods and predictions on large-scale systems
Sagnik Mondal ... David E Keyes
Journal of Parallel and Distributed Computing | VOL. 180
19 May 2023

Large deviations and exponential decay for the magnetization in a Gaussian random field
René A Carmona ... Lin Xu
Probability Theory and Related Fields | VOL. 106
01 Oct 1996

Chapter 9 - Gaussian delta-correlated random field (causal integral equations)
V.I. Klyatskin
Stochastic Equations through the Eye of the Physicist
01 Jan 2004

Properties of local-nondeterminism of Gaussian and stable random fields and their applications
Yimin Xiao
Annales de la Faculté des sciences de Toulouse : Mathématiques | VOL. 15
01 Jan 2006
