Network-based analysis of omics data: the LEAN method.

Frederik Gwinner, Elisabeth Tournier-Lasserve, Cécile Cardoso, Claire Vandiedonck, Benno Schwikowski, Olivier D Christophe, Johann Beghain, Iryna Nikolayeva, Oriol Guitart-Pla, Minh Arnould, Gwénola Boulday, Cécile V Denis

doi:10.1093/bioinformatics/btw676

Abstract

Motivation: Most computational approaches for the analysis of omics data in the context of interaction networks have very long running times, provide single or partial, often heuristic, solutions and/or contain user-tuneable parameters.

Results: We introduce local enrichment analysis (LEAN) for the identification of dysregulated subnetworks from genome-wide omics datasets. By substituting the common subnetwork model with a simpler local subnetwork model, LEAN allows exact, parameter-free, efficient and exhaustive identification of local subnetworks that are statistically dysregulated, and directly implicates single genes for follow-up experiments. Evaluation on simulated and biological data suggests that LEAN generally detects dysregulated subnetworks better, and reflects biological similarity between experiments more clearly than standard approaches. A strong signal for the local subnetwork around Von Willebrand Factor (VWF), a gene which showed no change on the mRNA level, was identified by LEAN in transcriptome data in the context of the genetic disease Cerebral Cavernous Malformations (CCM). This signal was experimentally found to correspond to an unexpected strong cellular effect on the VWF protein. LEAN can be used to pinpoint statistically significant local subnetworks in any genome-scale dataset.

Availability and Implementation: The R-package LEANR implementing LEAN is supplied as supplementary material and available on CRAN (https://cran.r-project.org).

Supplementary information: Supplementary data are available at Bioinformatics online.

Highlights

The organization of the molecular machinery of cells is thought to be inherently modular (Alon, 2003; Hartwell et al., 1999).
A strong signal for the local subnetwork around Von Willebrand Factor (VWF), a gene which showed no change on the mRNA level, was identified by local enrichment analysis (LEAN) in transcriptome data in the context of the genetic disease Cerebral Cavernous Malformations (CCM).
We verified that the graph radius of the simulated subnetworks was substantially larger than 1 to ensure that our evaluation dataset is not biased towards overly compact subnetworks, which would confer an advantage to the local subnetwork model.
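
The graph-radius check mentioned in the last highlight can be illustrated with a minimal Python sketch. It assumes that a local subnetwork consists of a center gene together with its direct neighbors in the interaction network, so that its radius is at most 1. The toy network and the function names `eccentricity`, `graph_radius` and `local_subnetwork` are hypothetical and are not taken from the LEANR package.

```python
from collections import deque


def eccentricity(adj, source):
    """Largest shortest-path distance from `source` (BFS; assumes a connected graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return max(dist.values())


def graph_radius(adj):
    """Radius of a graph: the smallest eccentricity over all of its nodes."""
    return min(eccentricity(adj, node) for node in adj)


def local_subnetwork(adj, center):
    """Assumed local subnetwork model: the center gene plus its direct neighbors."""
    nodes = {center} | set(adj[center])
    # Keep only edges whose endpoints both lie inside the local subnetwork.
    return {n: [m for m in adj[n] if m in nodes] for n in nodes}


# Hypothetical toy network: a path A-B-C-D-E with an extra gene F attached to C.
toy = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D", "F"],
    "D": ["C", "E"],
    "E": ["D"],
    "F": ["C"],
}

print(graph_radius(toy))                         # radius of the whole toy network
print(graph_radius(local_subnetwork(toy, "C")))  # radius of the local subnetwork around C
```

In this toy case the full network has radius 2, whereas the local subnetwork around C has radius 1. A simulated subnetwork with a radius well above 1 therefore cannot be covered by a single local subnetwork of this kind.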

Summary

Introduction

The organization of the molecular machinery of cells is thought to be inherently modular (Alon, 2003; Hartwell et al., 1999). When studying large-scale datasets, once gene-level scores have been computed, a common step is to aggregate them to the level of gene sets. Pathway analysis focuses on enrichment in annotated gene sets, such as genes involved in a common biological process. For the remainder of this article, we will use the terms pathway and gene set interchangeably in the above sense of a set of genes sharing a common functional annotation. Significant scores for a particular pathway suggest specific higher-level functional interpretations of the dataset (Khatri et al., 2012). The MSigDB database (Subramanian et al., 2005), for example, includes pathways corresponding to genes that share functional annotations, chromosomal locations or cis-regulatory motifs, or are part of specific molecular (e.g. oncogenic or immunologic) signatures.
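
As an illustration of aggregating gene-level scores to the gene-set level, the following minimal Python sketch runs a hypergeometric over-representation test, one common form of pathway analysis. It is not the LEAN procedure itself. The gene names, p-values, significance threshold `alpha` and function names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K belong to the pathway."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def pathway_enrichment(gene_pvalues, pathway, alpha=0.05):
    """Over-representation p-value of significant genes within one pathway."""
    universe = set(gene_pvalues)
    members = universe & set(pathway)  # pathway genes that were actually measured
    significant = {g for g, p in gene_pvalues.items() if p <= alpha}
    overlap = len(members & significant)
    return hypergeom_upper_tail(overlap, len(universe), len(members), len(significant))


# Hypothetical gene-level p-values for a universe of ten genes.
gene_pvalues = {
    "g1": 0.001, "g2": 0.010, "g3": 0.020, "g4": 0.040, "g5": 0.200,
    "g6": 0.350, "g7": 0.500, "g8": 0.650, "g9": 0.800, "g10": 0.950,
}
pathway = {"g1", "g2", "g3"}  # hypothetical annotated gene set

p_enrich = pathway_enrichment(gene_pvalues, pathway)
print(f"enrichment p-value: {p_enrich:.4f}")
```

Here all three pathway genes fall among the four significant genes, which gives an enrichment p-value of 7/210, about 0.033. In practice the p-values of many pathways would additionally be corrected for multiple testing.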

Methods

Results

Discussion

Conclusion