Abstract

In the context of Gaussian Graphical Models (GGMs) with high-dimensional, small-sample data, we present a simple procedure, called PACOSE – standing for PArtial COrrelation SElection – to estimate partial correlations under the constraint that some of them are strictly zero. This method can also be extended to covariance selection. If the goal is to estimate a GGM, our new procedure can be applied to re-estimate the partial correlations after a first graph has been estimated, in the hope of improving the estimation of the non-zero coefficients. This iterated version of PACOSE is called iPACOSE. In a simulation study, we compare PACOSE to existing methods and show that the re-estimated partial correlation coefficients may be closer to the true values in important cases. In addition, we show on simulated and real data that iPACOSE has attractive properties with regard to sensitivity, positive predictive value and stability.

Highlights

  • The robust estimation of the inverse covariance matrix is crucial in many multivariate statistical methods such as discriminant analysis or linear regression [1]

  • Although many covariance selection methods have been proposed in the literature, none of them is designed to estimate the partial correlation matrix in high-dimensional settings while incorporating a non-decomposable independence graph

  • In the context of systems biology, the estimation of Gaussian Graphical Models (GGMs) is very often characterized by a number of individuals or measurements (n) that is lower than the number of variables (p). In this n ≪ p situation, regularization techniques are mandatory to enable the estimation of GGMs


Summary

Introduction

The robust estimation of the inverse covariance matrix is crucial in many multivariate statistical methods such as discriminant analysis or linear regression [1]. A large body of literature is devoted to the estimation of the inverse covariance matrix in high-dimensional, small-sample settings, i.e. when the number of observations n is much smaller than the number of variables p (n ≪ p). A well-known example is the shrinkage estimator of Schäfer & Strimmer [3], which is defined as a weighted sum of the sample covariance matrix and a fixed (invertible) target matrix. This method can be considered “agnostic” in the sense that it estimates the covariance matrix in a completely data-driven way, i.e. without prior knowledge. In reference to covariance selection, we called our first method “PACOSE”, standing for PArtial COrrelation SElection.
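The weighted-sum form of a shrinkage estimator can be illustrated with a minimal sketch. It is not the estimator of [3], which selects the shrinkage intensity analytically: here the weight `lam` is fixed by hand and the target is assumed to be the diagonal matrix of sample variances. The function names `shrinkage_covariance` and `partial_correlations` are hypothetical, chosen only for this illustration. The sketch also shows why regularization is needed when n is smaller than p: the sample covariance matrix is then rank-deficient and cannot be inverted, whereas the shrunken matrix can, so partial correlations can be read off its inverse.

```python
import numpy as np

def shrinkage_covariance(X, lam):
    """Weighted sum of the sample covariance S and a diagonal target T.

    lam is a hand-picked weight in (0, 1]; this is an assumption of the
    sketch, not the analytically chosen intensity of a real estimator.
    """
    S = np.cov(X, rowvar=False)      # p x p sample covariance
    T = np.diag(np.diag(S))          # assumed target: sample variances only
    return lam * T + (1.0 - lam) * S

def partial_correlations(sigma):
    """Partial correlation matrix obtained from the inverse of sigma."""
    omega = np.linalg.inv(sigma)     # precision (inverse covariance) matrix
    d = np.sqrt(np.diag(omega))
    pcor = -omega / np.outer(d, d)   # rho_ij = -omega_ij / sqrt(omega_ii * omega_jj)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Simulated data with fewer observations than variables (n < p).
rng = np.random.default_rng(0)
n, p = 10, 30
X = rng.standard_normal((n, p))

S = np.cov(X, rowvar=False)
rank_S = np.linalg.matrix_rank(S)    # at most n - 1, so S is singular

sigma_hat = shrinkage_covariance(X, lam=0.5)
pcor = partial_correlations(sigma_hat)
```

With a positive weight on the diagonal target, `sigma_hat` is positive definite even though `S` is not, which is what makes the inversion in `partial_correlations` well defined.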
