Abstract

The Gaussian graphical model is a widely used tool for learning gene regulatory networks with high-dimensional gene expression data. Most existing methods for Gaussian graphical models assume that the data are homogeneous, i.e., all samples are drawn from a single Gaussian distribution. However, for many real problems, the data are heterogeneous, which may contain some subgroups or come from different resources. This paper proposes to model the heterogeneous data using a mixture Gaussian graphical model, and apply the imputation-consistency algorithm, combining with the ψ-learning algorithm, to estimate the parameters of the mixture model and cluster the samples to different subgroups. An integrated Gaussian graphical network is learned across the subgroups along with the iterations of the imputation-consistency algorithm. The proposed method is compared with an existing method for learning mixture Gaussian graphical models as well as a few other methods developed for homogeneous data, such as graphical Lasso, nodewise regression, and ψ-learning. The numerical results indicate superiority of the proposed method in all aspects of parameter estimation, cluster identification, and network construction. The numerical results also indicate generality of the proposed method: it can be applied to homogeneous data without significant harms. The accompanied R package GGMM is available at https://cran.r-project.org.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call