GPLEXUS: enabling genome-scale gene association network reconstruction and analysis for very large-scale expression data

Jun Li,Hairong Wei,Patrick Xuechun Zhao,Tingsong Liu

doi:10.1093/nar/gkt983

Open Access

Abstract

The accurate construction and interpretation of gene association networks (GANs) is challenging, but crucial, to the understanding of gene function, interaction and cellular behavior at the genome level. Most current state-of-the-art computational methods for genome-wide GAN reconstruction require high-performance computational resources. However, even high-performance computing cannot fully address the complexity involved in constructing GANs from very large-scale expression profile datasets, especially for organisms with medium- to large-sized genomes, such as those of most plant species. Here, we present a new approach, GPLEXUS (http://plantgrn.noble.org/GPLEXUS/), which integrates a series of novel algorithms in a parallel-computing environment to construct and analyze genome-wide GANs. GPLEXUS adopts an ultra-fast estimation for pairwise mutual information computing that is similar in accuracy and sensitivity to the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNE) method and runs ∼1000 times faster. GPLEXUS integrates the Markov Clustering Algorithm to effectively identify functional subnetworks. Furthermore, GPLEXUS includes a novel ‘condition-removing’ method to identify the major experimental conditions in which each subnetwork operates from very large-scale gene expression datasets across several experimental conditions, which allows users to annotate the various subnetworks with experiment-specific conditions. We demonstrate GPLEXUS’s capabilities by constructing global GANs and analyzing subnetworks related to defense against biotic and abiotic stress, cell cycle growth and division in Arabidopsis thaliana.
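To illustrate the subnetwork identification step mentioned in the abstract, the following is a minimal sketch of the general Markov Clustering (MCL) procedure: alternate expansion (matrix squaring) and inflation (element-wise powering followed by column normalization) until the flow matrix stabilizes. It is not the GPLEXUS implementation; the function names, the `inflation` value, the self-loop weight and the thresholds are all assumptions chosen for illustration.

```python
def normalize_columns(m):
    """Scale each column so it sums to 1 (columns of zeros are left as-is)."""
    n = len(m)
    out = [row[:] for row in m]
    for j in range(n):
        s = sum(m[i][j] for i in range(n))
        if s > 0:
            for i in range(n):
                out[i][j] = m[i][j] / s
    return out


def matmul(a, b):
    """Plain dense matrix product for small square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adjacency, inflation=2.0, iterations=50, tol=1e-9):
    """Hypothetical helper: cluster a weighted, undirected graph with MCL.

    `adjacency` is a square list-of-lists of non-negative edge weights.
    Returns a sorted list of clusters, each a sorted list of node indices.
    """
    n = len(adjacency)
    # Add self-loops (weight 1.0 is an assumption), then make columns stochastic.
    m = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        expanded = matmul(m, m)                       # expansion step
        inflated = normalize_columns(
            [[v ** inflation for v in row] for row in expanded])  # inflation step
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row that still carries flow (an attractor) defines one cluster.
    clusters = set()
    for row in m:
        members = frozenset(j for j, v in enumerate(row) if v > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy network: two disconnected triangles of co-expressed genes.
toy = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
toy_clusters = mcl(toy)
```

In this toy case the two triangles share no edges, so flow never crosses between them and each triangle is recovered as its own cluster. A dense list-of-lists representation is used only for readability; a genome-scale network would need a sparse matrix representation.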

Highlights

The availability of terabyte- and petabyte-sized gene expression datasets in public repositories [1,2] has inspired scientists to use genome-wide reverse genetic approaches to reconstruct gene networks and decipher the interaction between genes.
Our results show that the Spearman correlation-based transformation method that is implemented in GPLEXUS has a significantly reduced runtime compared with the original Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNE) method and the B-spline-based mutual information (MI) estimation method.
It was computationally infeasible to construct global gene association networks (GANs) using large-scale genomic datasets from plant species with small genomes, such as A. thaliana and G. max, with the original ARACNE method on a typical server (DELL PowerEdge R815 Server equipped with four 8-core CPUs and 128-GB RAM) without any optimization.
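The Spearman correlation-based transformation named in the highlights can be sketched as follows. The exact formula is not given on this page, so this sketch assumes the standard relation between correlation and MI for jointly Gaussian variables, MI = -0.5 ln(1 - ρ²), applied to the Spearman rank correlation ρ. The function names and the `eps` clamp are hypothetical, and the code is an illustration rather than the GPLEXUS implementation.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def mi_from_correlation(rho, eps=1e-12):
    """Assumed Gaussian transformation, in nats: -0.5 * ln(1 - rho^2).

    rho^2 is clamped just below 1 so that perfectly correlated
    profiles yield a large finite value instead of infinity.
    """
    r2 = min(rho * rho, 1.0 - eps)
    return -0.5 * math.log(1.0 - r2)


# Two hypothetical expression profiles measured across five conditions.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [1.0, 4.0, 9.0, 16.0, 25.0]   # monotone in gene_a, but not linear
rho_ab = spearman(gene_a, gene_b)
mi_ab = mi_from_correlation(rho_ab)
```

Because ranking costs O(n log n) per gene and the transformation is a closed-form expression per gene pair, this kind of estimate avoids the density estimation that kernel- or B-spline-based MI estimators perform for every pair, which is consistent with the runtime reduction reported above.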

Summary

Introduction

The availability of terabyte- and petabyte-sized gene expression datasets in public repositories [1,2] has inspired scientists to use genome-wide reverse genetic approaches to reconstruct gene networks and decipher the interaction between genes. One problem that is inherent in this co-expression network method is its high false-positive prediction rate, which is due to its inability to distinguish direct gene interactions from the large number of indirect interactions. Other methods, such as the Bayesian Network [7] and Graphical Gaussian Model (GGM) [9], can infer the local network structure with high precision [10], but cannot handle genome-wide network construction due to the increased computational complexity that arises from the large number of gene variables [10].

Journal: Nucleic Acids Research	Publication Date: Oct 30, 2013
Citations: 9	License type: CC BY 3.0
