Abstract

We introduce Gaussian orthogonal latent factor processes for modeling and predicting large correlated data. To handle the computational challenge, we first decompose the likelihood function of the Gaussian random field with a multi-dimensional input domain into a product of densities at the orthogonal components with lower-dimensional inputs. The continuous-time Kalman filter is implemented to compute the likelihood function efficiently without making approximations. We also show that the factor processes are independent a posteriori, as a consequence of the prior independence of the factor processes and the orthogonality of the factor loading matrix. For studies with large sample sizes, we propose a flexible way to model the mean, and we derive the marginal posterior distribution to solve identifiability issues in sampling these parameters. Applications to both simulated and real data confirm the outstanding performance of this method.
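The likelihood decomposition described above can be checked numerically on a toy problem. The sketch below is not the authors' implementation. It assumes a generic latent factor model Y = A Z + noise, where the k x d loading matrix A has orthonormal columns, the d rows of Z are independent zero-mean Gaussian processes on n one-dimensional inputs, and the noise is i.i.d. with variance noise_var. The exponential kernel, the dimensions, and all function names are illustrative assumptions. Each per-factor term is evaluated densely here for simplicity; the abstract states that the paper uses a continuous-time Kalman filter for this step instead.

```python
import numpy as np


def gauss_logpdf(y, cov):
    """Log-density of a zero-mean multivariate normal, evaluated densely."""
    _, logdet = np.linalg.slogdet(cov)
    quad = y @ np.linalg.solve(cov, y)
    return -0.5 * (y.size * np.log(2.0 * np.pi) + logdet + quad)


def exp_kernel(x, length_scale, variance):
    """Exponential covariance on 1-D inputs (an illustrative kernel choice)."""
    dist = np.abs(x[:, None] - x[None, :])
    return variance * np.exp(-dist / length_scale)


def full_loglik(Y, A, covs, noise_var):
    """Dense log-likelihood of vec(Y) using the full (k*n) x (k*n) covariance."""
    k, n = Y.shape
    big_cov = noise_var * np.eye(k * n)
    for l, cov in enumerate(covs):
        # Column-major vec(Y) has covariance sum_l cov_l (kron) a_l a_l^T.
        big_cov += np.kron(cov, np.outer(A[:, l], A[:, l]))
    return gauss_logpdf(Y.flatten(order="F"), big_cov)


def decomposed_loglik(Y, A, covs, noise_var):
    """Same quantity as a sum of d independent n-dimensional terms."""
    k, n = Y.shape
    d = A.shape[1]
    Y_proj = A.T @ Y  # d x n matrix of projected outputs
    total = 0.0
    for l, cov in enumerate(covs):
        total += gauss_logpdf(Y_proj[l], cov + noise_var * np.eye(n))
    # The part of Y orthogonal to the columns of A is pure noise.
    rss = np.sum(Y**2) - np.sum(Y_proj**2)
    total += -0.5 * n * (k - d) * np.log(2.0 * np.pi * noise_var)
    total += -0.5 * rss / noise_var
    return total


rng = np.random.default_rng(0)
k, n, d = 6, 15, 2
x = np.linspace(0.0, 1.0, n)
A, _ = np.linalg.qr(rng.standard_normal((k, d)))  # orthonormal columns
covs = [exp_kernel(x, 0.3, 1.0), exp_kernel(x, 0.1, 2.0)]
noise_var = 0.25
Y = rng.standard_normal((k, n))  # the identity holds for any data matrix

full = full_loglik(Y, A, covs, noise_var)
split = decomposed_loglik(Y, A, covs, noise_var)
print(np.isclose(full, split))
```

In this sketch the dense evaluation factorizes a matrix of size k*n, while the decomposed version only factorizes d matrices of size n, which is where the computational saving comes from under these assumptions.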

Highlights

  • Large spatial, spatio-temporal, and functional data are commonly used in various studies, including geological hazard quantification, engineering, and medical imaging, to facilitate scientific discoveries

  • In Supplementary Materials, we provide diagnostic plots of the fitted values from the Gaussian orthogonal latent factor (GOLF) model and predictive performance based on several configurations, including 40,000 Markov Chain Monte Carlo (MCMC) samples and different initial parameters

  • The computational time of GOLF is around 0.49 s per MCMC iteration for this example, which is comparable to NNGP (0.53 s) and faster than MRA (3.29 s)


Summary

Introduction

Large spatial, spatio-temporal, and functional data are commonly used in various studies, including geological hazard quantification, engineering, and medical imaging, to facilitate scientific discoveries. Gaussian processes (GPs) are widely used for modeling correlated data (Banerjee et al, 2014; Cressie and Cassie, 1993). However, the computational cost of evaluating the likelihood, which scales cubically with the number of observations, prevents fitting GPs directly to large correlated data sets. Tremendous efforts have been made in recent studies to approximate GP models, including the stochastic partial differential equation approach (Lindgren et al, 2011; Rue et al, 2009), hierarchical nearest neighbor methods (Datta et al, 2016), multi-resolution processes (Katzfuss, 2017), the local Gaussian process approach (Gramacy and Apley, 2015), periodic embedding (Guinness and Fuentes, 2017; Stroud et al, 2017), and covariance tapering (Kaufman et al, 2008).
