Space-log: a novel approach to inferring gene-gene networks using SPACE model with log penalty.

Qian Vicky Wu, Li Hsu, Wei Sun

doi:10.12688/f1000research.26128.1

Journal: F1000Research	Publication Date: Sep 21, 2020
License type: CC BY 4.0

Affiliation: Fred Hutch Cancer Center

Abstract

Gene expression data have been used to infer gene-gene networks (GGN), where an edge between two genes implies the conditional dependence of these two genes given all the other genes. Such gene-gene networks are often referred to as gene regulatory networks since they may reveal expression regulation. Most existing methods for identifying GGN employ penalized regression with an L1 (lasso), L2 (ridge), or elastic net penalty, the last of which spans the range from the L1 to the L2 penalty. However, for high-dimensional gene expression data, a penalty that spans the range between the L0 and L1 penalties, such as the log penalty, is often needed for variable selection consistency. Thus, we develop a novel method that employs the log penalty within the framework of an earlier network identification method, space (Sparse PArtial Correlation Estimation), and implement it in an R package, space-log. We show that space-log is computationally efficient (source code implemented in C) and performs well compared with other methods, particularly for networks with hubs. Space-log is open source and available on GitHub: https://github.com/wuqian77/SpaceLog.
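
As a rough illustration of how a log penalty can sit between L0 and L1, here is a minimal sketch. The abstract does not state the penalty formula, so the form lam * log(1 + |b| / tau), the function name `log_penalty`, and the parameters `lam` and `tau` are assumptions for illustration only, not the space-log implementation.

```python
import math


def log_penalty(b, lam, tau):
    """Assumed log penalty: lam * log(1 + |b| / tau), which is zero at b = 0."""
    return lam * math.log1p(abs(b) / tau)


# Large tau (with lam scaled by tau): the penalty approaches |b|, i.e. L1-like.
l1_like = log_penalty(2.0, lam=1e6, tau=1e6)

# Small tau: a tenfold larger coefficient is penalized only slightly more,
# so the penalty behaves more like a count of nonzero coefficients (L0-like).
l0_ratio = log_penalty(10.0, lam=1.0, tau=1e-9) / log_penalty(1.0, lam=1.0, tau=1e-9)

print(l1_like, l0_ratio)
```

Under this assumed form, `tau` acts as the knob that moves the penalty between the two regimes, while `lam` sets the overall strength.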

Highlights

The objective of this paper is to introduce a novel method that constructs gene-gene networks (GGN) from high-dimensional gene expression data.
We propose a new statistical method to estimate GGN by implementing the log penalty for the space approach, and we refer to our method as space-log.
We evaluated the performance of the methods using the following metrics: number of false positives (FP), number of false negatives (FN), FP+FN, F1 score, false discovery rate (FDR), and true positive rate.
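
The metrics listed above can be computed from a true and an estimated network as in the following minimal sketch. The function name `edge_metrics`, the 0/1 adjacency-matrix representation, and the toy four-gene hub network are assumptions for illustration; they are not taken from the paper or from the space-log package.

```python
def edge_metrics(true_adj, est_adj):
    """Compare two symmetric 0/1 adjacency matrices over undirected edges (i < j)."""
    n = len(true_adj)
    tp = fp = fn = 0
    for i in range(n):
        for j in range(i + 1, n):
            t, e = bool(true_adj[i][j]), bool(est_adj[i][j])
            tp += int(t and e)
            fp += int((not t) and e)
            fn += int(t and (not e))
    return {
        "FP": fp,
        "FN": fn,
        "FP+FN": fp + fn,
        # Each ratio falls back to 0.0 when its denominator is zero.
        "F1": 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 0.0,
        "FDR": fp / (tp + fp) if (tp + fp) else 0.0,
        "TPR": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical toy example: gene 0 is a hub connected to genes 1, 2, and 3.
true_adj = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
# The estimate misses edge (0, 3) and adds a spurious edge (1, 2).
est_adj = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 0, 0]]

metrics = edge_metrics(true_adj, est_adj)
print(metrics)
```

Only the upper triangle is counted so that each undirected edge contributes once.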

Summary

Introduction

The objective of this paper is to introduce a novel method that constructs gene-gene networks (GGN) from high-dimensional gene expression data. Graphical Lasso improves on neighborhood selection by providing a maximum likelihood estimate of the partial correlation matrix. The space method exploits the symmetry of the partial correlation matrix to improve estimation accuracy. It avoids potential conflicts in neighborhood selection, in which Yi is selected as a neighbor of Yj but Yj is not selected as a neighbor of Yi, so that a post-hoc decision is needed on whether Yi and Yj are connected. Penalties in the range of L0 to L1 are often needed to improve the accuracy of variable selection for high-dimensional gene expression data [4]. We propose a new statistical method to estimate GGN by implementing the log penalty for the space approach, and we refer to our method as space-log.
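
The neighborhood-selection conflict described above can be made concrete with a minimal sketch. It applies two common post-hoc conventions, often called the AND and OR rules; the source does not say which rule any given method uses, and the function name `symmetrize` and the toy selection matrix are hypothetical. The space method sidesteps this step by estimating the symmetric partial correlations directly.

```python
def symmetrize(selected, rule="and"):
    """Turn per-gene neighbor selections into an undirected adjacency matrix.

    selected[i][j] == 1 means Yj was selected as a neighbor of Yi.
    rule="and" keeps an edge only if both directions agree;
    rule="or" keeps it if either direction selected it.
    """
    if rule not in ("and", "or"):
        raise ValueError("rule must be 'and' or 'or'")
    n = len(selected)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a, b = bool(selected[i][j]), bool(selected[j][i])
            edge = (a and b) if rule == "and" else (a or b)
            adj[i][j] = adj[j][i] = int(edge)
    return adj


# Hypothetical selections: Y1 is a neighbor of Y0, but Y0 is not a neighbor of Y1.
selected = [[0, 1, 0], [0, 0, 1], [0, 1, 0]]

and_adj = symmetrize(selected, "and")  # drops the conflicting edge (0, 1)
or_adj = symmetrize(selected, "or")    # keeps the conflicting edge (0, 1)
print(and_adj, or_adj)
```

The two rules disagree exactly on the conflicting pair, which is the ambiguity a jointly estimated symmetric model removes.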
