Abstract

We introduce a Bayesian non-parametric spatial factor analysis model with spatial dependency induced through a prior on factor loadings. For each column of the loadings matrix, spatial dependency is encoded using a probit stick-breaking process (PSBP), and a multiplicative gamma process shrinkage prior is used across columns to adaptively determine the number of latent factors. By encoding spatial information into the loadings matrix, the model learns meaningful factors that respect the observed neighborhood dependencies, making them useful for assessing rates over space. Furthermore, the spatial PSBP prior can be used for clustering temporal trends, allowing users to identify regions within the spatial domain with similar temporal trajectories, an important task in many applied settings. We illustrate the model’s performance on simulated data and in two real-world examples: longitudinal monitoring of glaucoma and malaria surveillance across the Peruvian Amazon. The R package spBFA, available on CRAN, implements the method.
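To make the column-wise shrinkage concrete, here is a minimal sketch of the standard multiplicative gamma process construction, in which the precision of column h is the cumulative product τ_h = δ_1 × ... × δ_h of gamma-distributed terms. The function names, the hyperparameter values a1 and a2, and the choice to shrink normally distributed loadings directly are assumptions made for illustration; the paper places the shrinkage on the PSBP atoms, and this is not the spBFA implementation.

```python
import random


def mgp_column_precisions(num_columns, a1=2.0, a2=3.0, seed=0):
    """Draw cumulative precisions tau_h = delta_1 * ... * delta_h.

    delta_1 ~ Gamma(a1, 1) and delta_l ~ Gamma(a2, 1) for l >= 2.
    With a2 > 1 the precisions tend to grow with the column index,
    so later columns are shrunk more strongly toward zero.
    """
    rng = random.Random(seed)
    precisions = []
    tau = 1.0
    for h in range(num_columns):
        shape = a1 if h == 0 else a2
        delta = rng.gammavariate(shape, 1.0)  # shape, scale
        tau *= delta
        precisions.append(tau)
    return precisions


def draw_loadings(num_rows, precisions, seed=1):
    """Draw a loadings matrix whose column h has variance 1 / tau_h.

    Illustrative assumption: local (element-wise) precisions are omitted.
    """
    rng = random.Random(seed)
    return [
        [rng.gauss(0.0, tau ** -0.5) for tau in precisions]
        for _ in range(num_rows)
    ]


precisions = mgp_column_precisions(num_columns=5)
loadings = draw_loadings(num_rows=4, precisions=precisions)
```

Because later columns receive increasingly large precisions on average, their entries concentrate near zero, which is how a prior of this kind can effectively switch off unneeded factors.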

Highlights

  • The covariance for the standard Bayesian factor model, Ψ = ΛΛᵀ + Σ, is a matrix decomposition, constructed to learn a latent representation for some potentially high-dimensional data object Y_t = {Y_t(s_1), ..., Y_t(s_m)}

  • We focus on the class of spatial probit stick-breaking process (PSBP) models, where each column of the factor loadings matrix has the form Mj = {Gij,o : si,o ∈ D}, and the columns are progressively shrunk by the gamma process shrinkage prior on the atoms

  • The clearest conclusion is that Models 1 and 3, the two models that include the spatial PSBP (Model 3 omits the multiplicative gamma process shrinkage prior), perform best across all settings
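As a rough illustration of the stick-breaking step behind a PSBP, the sketch below computes mixture weights w_k = Φ(α_k) ∏_{l<k} (1 − Φ(α_l)), where Φ is the standard normal CDF. The function names and the finite truncation (the last weight takes whatever stick remains) are assumptions for illustration only. In a spatial PSBP the latent α_k would vary by location, for example through a spatially correlated prior, so that nearby locations receive similar weights; that part is not modelled here.

```python
import math


def normal_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def psbp_weights(alphas):
    """Turn K - 1 latent values into K probit stick-breaking weights.

    Each alpha breaks off a fraction Phi(alpha) of the remaining stick;
    the final weight is the leftover piece, so the weights sum to one.
    """
    weights = []
    remaining = 1.0
    for alpha in alphas:
        fraction = normal_cdf(alpha)
        weights.append(fraction * remaining)
        remaining *= 1.0 - fraction
    weights.append(remaining)
    return weights


# Two hypothetical locations with similar latent values get similar weights.
weights_location_a = psbp_weights([0.0, 0.0])
weights_location_b = psbp_weights([0.1, -0.1])
```

Sharing atoms across locations while letting the weights vary is what allows locations to be grouped: two locations that put most of their weight on the same atom fall into the same cluster.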


Summary

Introduction

The covariance for the standard Bayesian factor model, Ψ = ΛΛᵀ + Σ, is a matrix decomposition, constructed to learn a latent representation for some potentially high-dimensional data object Y_t = {Y_t(s_1), ..., Y_t(s_m)}. We use notation from the spatial statistics literature to indicate the dimension of Y_t; this is only for consistency throughout the remainder of the paper. The data object Y_t is often not spatial in nature, but a vector that contains a large number of highly collinear variables. Throughout this paper, we refer to this dimension as the “variable dimension” of the data. The subscript t indexes observed repetitions of the data object, which can be inherently independent, spatial, or temporal in nature; we refer to this data dimension as the “replication dimension”.
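The decomposition can be illustrated with a small numeric sketch. The loadings and noise variances below are made-up values, the function name is hypothetical, and Σ is assumed to be diagonal, as is conventional for the standard factor model.

```python
def factor_covariance(loadings, noise_variances):
    """Return Psi = Lambda Lambda^T + Sigma, with Sigma diagonal.

    loadings: m rows (variables) by k columns (latent factors).
    noise_variances: the m diagonal entries of Sigma.
    """
    m = len(loadings)
    psi = [
        [sum(a * b for a, b in zip(loadings[i], loadings[j])) for j in range(m)]
        for i in range(m)
    ]
    for i in range(m):
        psi[i][i] += noise_variances[i]
    return psi


# Three variables explained by two latent factors (illustrative values).
loadings = [
    [1.0, 0.0],
    [1.0, 0.5],
    [0.0, 2.0],
]
noise_variances = [0.1, 0.2, 0.3]
psi = factor_covariance(loadings, noise_variances)
```

Variables 1 and 3 load on different factors, so their implied covariance is zero, while variable 2 loads on both factors and is correlated with each of the others. All off-diagonal structure comes from the shared factors, which is what makes the low-rank term ΛΛᵀ a compact description of highly collinear variables.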
