Abstract
Selecting informative nodes over large-scale networks has become increasingly important in many research areas. Most existing methods focus on the local network structure and incur heavy computational costs on large-scale problems. In this work, we propose a novel prior model for Bayesian network marker selection in the generalized linear model (GLM) framework: the Thresholded Graph Laplacian Gaussian (TGLG) prior, which adopts the graph Laplacian matrix to characterize the conditional dependence between neighboring markers while accounting for the global network structure. Under mild conditions, we show that the proposed model enjoys posterior consistency with a diverging number of edges and nodes in the network. We also develop a Metropolis-adjusted Langevin algorithm (MALA) for efficient posterior computation, which is scalable to large-scale networks. We illustrate the superiority of the proposed method over existing alternatives via extensive simulation studies and an analysis of the breast cancer gene expression dataset in The Cancer Genome Atlas (TCGA).
Highlights
In biomedical research, complex biological systems are often modeled or represented as biological networks (Kitano, 2002).
To address limitations of existing methods, we propose a new prior model, the thresholded graph Laplacian Gaussian (TGLG) prior, which performs network marker selection over large-scale networks by thresholding a latent continuous variable attached to each node.
Following the settings in Li and Li (2008), Zhe et al. (2013) and Kim et al. (2013), we simulate small, simple gene networks consisting of multiple subnetworks, where each subnetwork contains one transcription factor (TF) gene and 10 target genes connected to the TF gene; two of the subnetworks are set as the true network markers.
Summary
Complex biological systems are often modeled or represented as biological networks (Kitano, 2002). To address limitations of existing methods, we propose a new prior model, the thresholded graph Laplacian Gaussian (TGLG) prior, which performs network marker selection over large-scale networks by thresholding a latent continuous variable attached to each node. We propose to build the threshold priors using the graph Laplacian matrix, which has been used to capture the structural dependence between neighboring nodes (Li and Li, 2008; Zhe et al., 2013; Li and Li, 2010). Most of those frequentist methods directly specify the graph Laplacian matrix from the existing biological network.
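As a rough illustration of the two ingredients named above, the following Python sketch builds the graph Laplacian L = D - A for a network made of TF-plus-targets subnetworks and applies a hard threshold to a latent value attached to each node. It is a minimal sketch under stated assumptions, not the TGLG prior itself: the function names, the indicator rule |gamma_i| > lam, and the example latent values are hypothetical and chosen only for illustration. The quadratic form gamma' L gamma equals the sum of squared differences across edges, which is one standard way to see how the Laplacian encodes dependence between neighboring nodes.

```python
def star_network_edges(n_subnetworks, n_targets=10):
    """Edges of a network in which each subnetwork is one TF node
    connected to n_targets target nodes (assumed layout: the TF node
    comes first, followed by its targets)."""
    size = n_targets + 1
    edges = []
    for s in range(n_subnetworks):
        tf = s * size
        for k in range(1, size):
            edges.append((tf, tf + k))
    return edges


def graph_laplacian(n_nodes, edges):
    """Unnormalized graph Laplacian L = D - A as a list of lists."""
    L = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i, j in edges:
        L[i][i] += 1.0   # degree of node i
        L[j][j] += 1.0   # degree of node j
        L[i][j] -= 1.0   # minus adjacency
        L[j][i] -= 1.0
    return L


def laplacian_quadratic_form(L, gamma):
    """gamma' L gamma, equal to the sum over edges of (gamma_i - gamma_j)^2."""
    n = len(gamma)
    return sum(gamma[i] * L[i][j] * gamma[j]
               for i in range(n) for j in range(n))


def threshold_select(gamma, lam):
    """Hypothetical hard-threshold rule: node i is selected when |gamma_i| > lam."""
    return [abs(g) > lam for g in gamma]


# Two subnetworks of 11 nodes each; the first one carries a nonzero latent value.
edges = star_network_edges(n_subnetworks=2)
L = graph_laplacian(22, edges)
gamma = [1.0] * 11 + [0.0] * 11
selected = threshold_select(gamma, lam=0.5)
roughness = laplacian_quadratic_form(L, gamma)
```

Here the latent values are constant within each subnetwork, so the quadratic form is zero and the threshold selects exactly the nodes of the first subnetwork. Latent values that differ across an edge would increase the quadratic form, which is why a Laplacian-based Gaussian prior tends to favor selecting connected nodes together.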