Abstract

We introduce a class of scalable Bayesian hierarchical models for the analysis of massive geostatistical datasets. The underlying approach combines ideas on high-dimensional geostatistics by partitioning the spatial domain and modeling the regions in the partition using a sparsity-inducing directed acyclic graph (DAG). We extend the model over the DAG to a well-defined spatial process, which we call the meshed Gaussian process (MGP). A major contribution is the development of MGPs on tessellated domains, accompanied by a Gibbs sampler for the efficient recovery of spatial random effects. In particular, the cubic MGP (Q-MGP) can harness high-performance computing resources by executing all large-scale operations in parallel within the Gibbs sampler, improving mixing and computing time compared to sequential updating schemes. Unlike some existing models for large spatial data, a Q-MGP facilitates massive caching of expensive matrix operations, making it particularly apt for dealing with spatiotemporal remote-sensing data. We compare Q-MGPs against state-of-the-art methods on large synthetic and real-world data. We also illustrate Q-MGPs using Normalized Difference Vegetation Index (NDVI) data from the Serengeti park region to recover latent multivariate spatiotemporal random effects at millions of locations. The source code is available at github.com/mkln/meshgp. Supplementary materials for this article are available online.

Highlights

  • Collecting large quantities of spatial and spatiotemporal data is commonplace in many fields

  • Our focus here is on developing tessellated Gaussian processes (GPs) as a methodology that enables the efficient recovery of the latent spatial random effects and the Bayesian estimation of covariance parameters via MCMC; we are not focusing on alternative computational algorithms, which have been developed for nearest-neighbor Gaussian processes (NNGPs) but can all be adapted to general meshed Gaussian process (MGP) models

  • Q-MGPs automatically adjust to settings where observed locations T are on partly regular lattices, that is, they are located at patterns repeating in space or time which emerge after initial inspections of the data
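
To illustrate why repeating location patterns allow caching, here is a minimal sketch, not the paper's caching algorithm. It assumes a stationary exponential covariance, so a block's covariance matrix depends only on the relative positions of its locations, and equally shaped blocks on a regular lattice can share one Cholesky factor. The names `exp_cov`, `cholesky`, and `CholeskyCache`, and the decay parameter `phi`, are hypothetical and chosen for illustration only.

```python
import math

def exp_cov(coords, phi):
    """Exponential covariance exp(-phi * distance) with unit variance (assumed kernel)."""
    return [[math.exp(-phi * math.dist(a, b)) for b in coords] for a in coords]

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

class CholeskyCache:
    """Hypothetical cache keyed by a block's locations relative to its corner."""

    def __init__(self, phi):
        self.phi = phi
        self.store = {}
        self.n_computed = 0  # number of factorizations actually performed

    def get(self, coords):
        dim = len(coords[0])
        origin = [min(c[k] for c in coords) for k in range(dim)]
        # Relative offsets identify the pattern; rounding guards against float noise.
        key = tuple(tuple(round(c[k] - origin[k], 8) for k in range(dim))
                    for c in coords)
        if key not in self.store:
            self.store[key] = cholesky(exp_cov(key, self.phi))
            self.n_computed += 1
        return self.store[key]

# An 8 x 8 regular lattice split into four 4 x 4 blocks with identical layout.
blocks = [[(float(x), float(y))
           for x in range(bx, bx + 4) for y in range(by, by + 4)]
          for bx in (0, 4) for by in (0, 4)]

cache = CholeskyCache(phi=0.5)
factors = [cache.get(block) for block in blocks]
print("blocks:", len(blocks), "factorizations computed:", cache.n_computed)
```

In this sketch all four blocks map to the same cache key, so only one factorization is computed. The key relies on locations being listed in the same order within each block; an irregular block would simply produce a new key and its own factorization.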


Summary

Introduction

Collecting large quantities of spatial and spatiotemporal data is commonplace in many fields. Central to tessellated GPs is the idea of forcing a DAG with known coloring on the data, resulting in guaranteed efficiencies when recovering the latent spatial effects. This strategy is analogous in spirit to multi-resolution approximations (Gramacy and Lee 2008; Katzfuss 2017), which force a DAG on the data, resulting in conditional independence patterns that are known in advance and that can be used to improve computations. Our focus here is on developing tessellated GPs as a methodology that enables the efficient recovery of the latent spatial random effects and the Bayesian estimation of covariance parameters via MCMC; we are not focusing on alternative computational algorithms (see, e.g., Finley et al. 2019), which have been developed for NNGPs but can all be adapted to general MGP models. Supplementary materials accompanying this article as an appendix are available online and contain further comparisons of Q-MGPs with several state-of-the-art methods for spatial data.
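
The role of a DAG with known coloring can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' implementation: it assumes a two-dimensional grid of blocks in which each block's parents are its left and lower neighbors (a hypothetical parent structure), builds the moral graph (parent-child links plus links between co-parents), and checks that a parity-based coloring never gives two linked blocks the same color. Blocks sharing a color are then conditionally independent given all other blocks, so a Gibbs sampler could update them in parallel. All function names are hypothetical.

```python
from itertools import combinations, product

def cubic_dag_parents(nrow, ncol):
    """Assumed parent structure: the left and lower neighbors of each block."""
    parents = {}
    for i, j in product(range(nrow), range(ncol)):
        found = []
        if i > 0:
            found.append((i - 1, j))
        if j > 0:
            found.append((i, j - 1))
        parents[(i, j)] = found
    return parents

def moral_graph(parents):
    """Undirected graph linking each node to its parents, and co-parents to each other."""
    adj = {node: set() for node in parents}
    for child, pars in parents.items():
        for p in pars:
            adj[child].add(p)
            adj[p].add(child)
        for a, b in combinations(pars, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def parity_coloring(parents):
    """Color each block by the parity of its row and column index (four colors)."""
    return {(i, j): 2 * (i % 2) + (j % 2) for (i, j) in parents}

def is_proper(adj, color):
    """True if no two linked nodes share a color."""
    return all(color[u] != color[v] for u in adj for v in adj[u])

parents = cubic_dag_parents(4, 4)
adj = moral_graph(parents)
color = parity_coloring(parents)

# Group blocks by color: each group is a candidate set for one parallel update.
groups = {}
for node, c in sorted(color.items()):
    groups.setdefault(c, []).append(node)

print("coloring is proper:", is_proper(adj, color))
for c, nodes in sorted(groups.items()):
    print("color", c, "->", nodes)
```

Because the coloring is fixed by the grid indices, the update schedule is known before any data are seen: a sampler would loop over the colors sequentially and over the blocks within each color in parallel. The number of colors needed depends on the parent structure assumed here and may differ for other tessellations.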

Spatial Processes on Partitioned Domains
Meshed Gaussian Processes
Bayesian Hierarchical Model and Gibbs Sampler
Nonseparable Multivariate Spatiotemporal Covariances
MGPs Based on Domain Tessellation or Tiling
Caching Redundant Expensive Matrix Operations
Improved Mixing via Parallel Sampling
Data Analysis
Synthetic Data
NDVI Data From the Serengeti Ecosystem
Discussion
A Spatial Meshed Process
B Meshed Gaussian Process
D Application
Synthetic data
Spatial multivariate analysis of NDVI in the Serengeti region
G Caching algorithm
H Compute time comparisons with NNGPs
Findings
Effective sample size of MCMC posterior samples