Abstract

Background: Inflammatory bowel disease (IBD) consists of two main disease-subtypes, Crohn’s disease (CD) and ulcerative colitis (UC); these subtypes share overlapping genetic and clinical features. Genome-wide microarray data enable unbiased documentation of alterations in gene expression that may be disease-specific. As genetic diseases are believed to be caused by genetic alterations affecting the function of signalling pathways, module-centric optimisation algorithms, whose aim is to identify sub-networks that are dys-regulated in disease, are emerging as promising approaches.

Results: In order to account for the topological structure of molecular interaction networks, we developed an optimisation algorithm that integrates databases of known molecular interactions with gene expression data; such integration enables identification of differentially regulated network modules. We verified the performance of our algorithm by testing it on simulated networks; we then applied the same method to study experimental data derived from microarray analysis of CD and UC biopsies and human interactome databases. This analysis allowed the extraction of dys-regulated subnetworks under different experimental conditions (inflamed and uninflamed tissues in CD and UC). Optimisation was performed to highlight differentially expressed network modules that may be common or specific to the disease subtype.

Conclusions: We show that the selected subnetworks include genes and pathways of known relevance for IBD; in particular, the solutions found highlight cross-talk among enriched pathways, mainly the JAK/STAT signalling pathway and the EGF receptor signalling pathway. In addition, integration of gene expression with molecular interaction data highlights nodes that, although not differentially expressed, interact with differentially expressed nodes and are part of pathways that are relevant to IBD. The method proposed here may help identify dys-regulated sub-networks that are common to different diseases and sub-networks whose dys-regulation is specific to a particular disease.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-0886-z) contains supplementary material, which is available to authorized users.

Highlights

  • Inflammatory bowel disease (IBD) consists of two main disease-subtypes, Crohn’s disease (CD) and ulcerative colitis (UC); these subtypes share overlapping genetic and clinical features

  • The algorithm input consists of a network of known protein interactions and of the z-scores calculated from the p-values of two lists of differentially expressed genes; the latter are derived from biopsies of patients affected by CD against controls and biopsies of patients affected by UC against controls

  • Microarray data were downloaded from the NCBI Gene Expression Omnibus website [16] and normalised using the GEO2R R script [17]. These data were obtained using high-density oligonucleotide microarrays that interrogate 10,000 full-length genes to compare gene expression patterns in CD, UC and a third non-IBD colitis group
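
The highlights state that the algorithm takes z-scores calculated from the p-values of two differential-expression lists (CD against controls and UC against controls), but the exact transform is not given here. The sketch below assumes the common convention z = Φ⁻¹(1 − p), where Φ⁻¹ is the inverse standard normal CDF; the function names `p_to_z` and `node_scores`, the clipping constant `eps`, and the gene labels are hypothetical and used only for illustration.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def p_to_z(p, eps=1e-12):
    """Map a p-value to a z-score, assuming z = Phi^{-1}(1 - p).

    The p-value is clipped to [eps, 1 - eps] because the inverse CDF
    is undefined at exactly 0 and 1. Both the transform and the
    clipping are assumptions, not taken from the paper.
    """
    p = min(max(p, eps), 1.0 - eps)
    return _STD_NORMAL.inv_cdf(1.0 - p)


def node_scores(p_cd, p_uc):
    """Pair the two z-scores (CD vs control, UC vs control) per gene.

    Only genes present in both p-value lists are kept.
    """
    shared = p_cd.keys() & p_uc.keys()
    return {g: (p_to_z(p_cd[g]), p_to_z(p_uc[g])) for g in sorted(shared)}


# Toy p-values with hypothetical gene labels (not real data).
p_cd = {"geneA": 0.001, "geneB": 0.5, "geneC": 0.04}
p_uc = {"geneA": 0.3, "geneB": 0.002}
scores = node_scores(p_cd, p_uc)
```

Under this convention a small p-value maps to a large positive z-score, and p = 0.5 maps to zero, so each network node carries one score per disease comparison.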

Summary

Introduction

Muraro and Simmons, BMC Bioinformatics (2016) 17:42

Inflammatory bowel disease (IBD), including ulcerative colitis (UC) and Crohn’s disease (CD), arises from a breakdown in the normally symbiotic relationship between intestinal microflora and mucosa in individuals with a given genetic background. Integrating gene expression data with databases of known molecular interactions may provide several advantages: it can uncover functional pathways driving disease-specific expression signatures; it can identify ‘hidden nodes’ that, although not differentially expressed, may play an important role in connecting differentially expressed genes; and it increases statistical robustness, since differential expression is evaluated at the network level rather than for each gene individually [2, 3, 6]. In the context of disease networks, network modules are typically defined as subsets of highly interconnected genes showing a significant overall differential expression in disease as compared with control cells [2].
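
To make the idea of a module scored at the network level concrete, here is a minimal sketch. It assumes a widely used aggregate score, the sum of node z-scores divided by the square root of the module size, and is not taken from the paper; the authors' actual objective function may differ. The helper names `is_connected` and `module_score` and the toy node labels are hypothetical.

```python
from collections import deque
from math import sqrt


def is_connected(nodes, adjacency):
    """Return True if `nodes` induce a connected subgraph (breadth-first search)."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency.get(current, ()):
            if neighbour in nodes and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen == nodes


def module_score(nodes, z):
    """Aggregate z-score of a module: sum(z_i) / sqrt(k).

    This is an assumed scoring rule chosen for illustration only.
    """
    nodes = list(nodes)
    return sum(z[n] for n in nodes) / sqrt(len(nodes))


# Toy network: H is a 'hidden node' with no differential expression
# that links the two differentially expressed nodes A and B.
adjacency = {"A": {"H"}, "H": {"A", "B"}, "B": {"H"}, "C": set()}
z = {"A": 3.0, "B": 3.0, "H": 0.0, "C": 0.1}

without_hidden = is_connected({"A", "B"}, adjacency)      # A and B are not adjacent
with_hidden = is_connected({"A", "H", "B"}, adjacency)    # H bridges them
score = module_score({"A", "H", "B"}, z)
```

In this toy case A and B form a connected module only when H is included, which is how a node that is not itself differentially expressed can still end up in a high-scoring subnetwork.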

