Graphical Model Selection for Gaussian Conditional Random Fields in the Presence of Latent Variables

Benjamin Frot, Luke Jostins, Gilean McVean

doi:10.1080/01621459.2018.1434531

Abstract

We consider the problem of learning a conditional Gaussian graphical model in the presence of latent variables. Building on recent advances in this field, we suggest a method that decomposes the parameters of a conditional Markov random field into the sum of a sparse and a low-rank matrix. We derive convergence bounds for this estimator and show that it is well-behaved in the high-dimensional regime as well as “sparsistent” (i.e., capable of recovering the graph structure). We then show how proximal gradient algorithms and semi-definite programming techniques can be employed to fit the model to thousands of variables. Through extensive simulations, we illustrate the conditions required for identifiability and show that there is a wide range of situations in which this model performs significantly better than its counterparts, for example, by accommodating more latent variables. Finally, the suggested method is applied to two datasets comprising individual-level data on genetic variants and metabolite levels. We show that our results replicate better than those of alternative approaches and exhibit enriched biological signal. Supplementary materials for this article are available online.
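The abstract mentions proximal gradient algorithms for a sparse plus low-rank decomposition but does not spell out the penalties. The sketch below assumes the usual choices in this literature, an entrywise ℓ1 penalty on the sparse part and a trace penalty on a positive semi-definite low-rank part, and shows only the two proximal operators such an algorithm would alternate with gradient steps. The function names are hypothetical and this is not the authors' implementation.

```python
import numpy as np


def soft_threshold(A, tau):
    """Proximal operator of tau * ||A||_1 (entrywise l1 norm).

    Assumed penalty for the sparse component: shrinks every entry
    toward zero by tau and sets small entries exactly to zero.
    """
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)


def psd_trace_threshold(L, tau):
    """Proximal operator of tau * trace(L) over symmetric PSD matrices.

    Assumed penalty for the low-rank component: eigenvalues are shifted
    down by tau and clipped at zero, which lowers the rank.
    """
    L = (L + L.T) / 2.0            # symmetrize against round-off
    w, V = np.linalg.eigh(L)       # eigenvalues in ascending order
    w = np.maximum(w - tau, 0.0)
    return (V * w) @ V.T           # V diag(w) V^T


# Illustrative inputs (made up for this sketch).
S_step = soft_threshold(np.array([[3.0, -0.5], [0.2, -2.0]]), 1.0)
L_step = psd_trace_threshold(np.diag([3.0, 0.5]), 1.0)
```

In a full proximal gradient scheme, each iteration would take a gradient step on the smooth likelihood term and then apply these two operators to the sparse and low-rank blocks respectively; the step size and the likelihood itself are not given in this excerpt and are left out here.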

Highlights

The task of performing graphical model selection arises in many applications in science and engineering.
We study the properties of the proposed model on synthetic data and compare its performance to that of the three other methods introduced earlier: the graphical lasso (GLASSO) (Friedman, Hastie, and Tibshirani 2008), the sparse conditional Gaussian graphical model (SCGGM) (Sohn and Kim 2012; Zhang and Kim 2014; Wytock and Kolter 2013), and the low-rank plus sparse decomposition (LR+S) (Chandrasekaran, Parrilo, and Willsky 2012).
By analogy to the Area Under the Curve (AUC) metric, we report the “volume under the surface” (VUS), which accounts for the effect of both regularization parameters.

Summary

Introduction

The task of performing graphical model selection arises in many applications in science and engineering. It is common that only a subset of the relevant variables is observed, and estimators that do not account for hidden variables are prone to confounding. On the other hand, modeling latent variables is itself difficult because of identifiability and tractability issues. The number of variables being modeled is often greater than the number of samples. It is well known that, in such a scaling regime, obtaining a consistent estimator is usually impossible without making further assumptions about the model, for example, sparsity or low-dimensionality. Modeling the joint distribution over all observed variables is not always relevant. It is sometimes preferable to learn a graphical model over a number of variables of interest while conditioning on the rest of the collection.
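The confounding effect of hidden variables can be illustrated numerically. The sketch below uses a standard Gaussian identity (the marginal precision matrix of the observed variables is a Schur complement of the joint precision matrix) on a made-up toy model with four observed variables in a chain and one hidden variable connected to all of them. The numbers and the function name are assumptions chosen for illustration only.

```python
import numpy as np


def marginal_precision(K, n_obs):
    """Precision matrix of the first n_obs variables after marginalizing
    out the remaining (hidden) ones, via the Schur complement:
        K_OO - K_OH K_HH^{-1} K_HO
    Returns the marginal precision and the subtracted low-rank term.
    """
    K_oo = K[:n_obs, :n_obs]
    K_oh = K[:n_obs, n_obs:]
    K_hh = K[n_obs:, n_obs:]
    low_rank = K_oh @ np.linalg.solve(K_hh, K_oh.T)
    return K_oo - low_rank, low_rank


# Toy joint precision matrix: variables 0..3 observed, variable 4 hidden.
K = np.zeros((5, 5))
np.fill_diagonal(K, 2.0)
for i in range(3):                 # sparse chain 0 - 1 - 2 - 3
    K[i, i + 1] = K[i + 1, i] = 0.5
K[:4, 4] = K[4, :4] = 0.4          # hidden variable touches every observed one

marginal, low_rank = marginal_precision(K, 4)

# Variables 0 and 2 are not adjacent in the true graph ...
print("true entry (0, 2):    ", K[0, 2])
# ... but ignoring the hidden variable makes them look connected.
print("marginal entry (0, 2):", marginal[0, 2])
print("rank of low-rank term:", np.linalg.matrix_rank(low_rank))
```

The observed block of the joint precision matrix is sparse, while the marginal precision matrix is dense; the difference between the two has rank at most the number of hidden variables, which is what motivates a sparse plus low-rank parameterization.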

Journal: Journal of the American Statistical Association
Publication Date: Jul 11, 2018
Citations: 8
License type: open-access

