Abstract

Signaling and regulatory pathways that guide gene expression have only been partially defined for most organisms. However, given the increasing number of microarray measurements, it may be possible to reconstruct such pathways and uncover missing connections directly from experimental data. Using a compendium of microarray gene expression data obtained from Escherichia coli, we constructed a series of Bayesian network models for the reactive oxygen species (ROS) pathway as defined by EcoCyc. A consensus Bayesian network model was generated using those networks sharing the top recovered score. This microarray-based network only partially agreed with the known ROS pathway curated from the literature and databases. A top network was then expanded to predict genes that could enhance the Bayesian network model using an algorithm we termed ‘BN+1’. This expansion procedure predicted many stress-related genes (e.g., dusB and uspE), and their possible interactions with other ROS pathway genes. A term enrichment method discovered that biofilm-associated microarray data usually contained high expression levels of both uspE and gadX. The predicted involvement of gene uspE in the ROS pathway and interactions between uspE and gadX were confirmed experimentally using E. coli reporter strains. Genes gadX and uspE showed a feedback relationship in regulating each other's expression. Both genes were verified to regulate biofilm formation through gene knockout experiments. These data suggest that the BN+1 expansion method can faithfully uncover hidden or unknown genes for a selected pathway with significant biological roles. The presently reported BN+1 expansion method is a generalized approach applicable to the characterization and expansion of other biological pathways and living systems.

Highlights

  • In this study, we explore how a biological pathway can be defined, and identify a set of methods to automatically learn a pathway from experimental data

  • We describe the Bayesian network pathways identified from gene expression data, and the expansions to each network as predicted using the BN+1 algorithm (Figure 1)

  • We addressed two questions: (1) Does a microarray-based Bayesian network reconstruction match the known pathway from the literature and existing databases? (2) Is a network expansion approach such as BN+1 useful in predicting new, biologically significant genes?


Summary

Introduction

We explore how a biological pathway can be defined, and identify a set of methods to automatically learn a pathway from experimental data. When an annotated pathway is used to analyze microarray gene expression data, the assumption is made that the ideal microarray-derived network will be the same as that in the literature. This assumption may not hold, since many pathways are defined based on observed protein-protein and protein-DNA interactions, metabolic fluxes, and subsets of well-studied genes. The selected pathway representation may be incomplete and may not include relevant regulator or effector molecules, necessitating computational prediction and subsequent validation. To address this issue, we introduce a method to systematically expand a pathway by identifying new genes that, from a gene expression perspective, better define the pathway itself.
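The expansion idea can be illustrated with a small sketch. This is not the authors' implementation: it assumes a fixed core network, swaps in a linear-Gaussian BIC-style score, and tries only a single edge between each candidate gene and the core. The function names (`local_score`, `bn_plus_one`), the gene labels, and the synthetic expression data are all hypothetical and exist only for illustration.

```python
import numpy as np

def local_score(data, gene, parents):
    """BIC-style score (higher is better) of one gene given its parents,
    assuming a linear-Gaussian model. This scoring choice is an assumption."""
    y = data[gene]
    n = len(y)
    # Design matrix: intercept column plus one column per parent gene.
    A = np.column_stack([np.ones(n)] + [data[p] for p in parents])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = max(resid @ resid / n, 1e-12)  # floor avoids log(0)
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return loglik - 0.5 * A.shape[1] * np.log(n)

def bn_plus_one(data, core_parents, candidates):
    """Rank candidate genes by the best score gain from attaching each one
    to the core network with a single edge (as parent or as child)."""
    ranking = []
    for cand in candidates:
        cand_alone = local_score(data, cand, [])
        best_gain, best_edge = -np.inf, None
        for gene, pa in core_parents.items():
            # Candidate as an extra parent of a core gene.
            gain = local_score(data, gene, pa + [cand]) - local_score(data, gene, pa)
            if gain > best_gain:
                best_gain, best_edge = gain, (cand, gene)
            # Candidate as a child of a core gene.
            gain = local_score(data, cand, [gene]) - cand_alone
            if gain > best_gain:
                best_gain, best_edge = gain, (gene, cand)
        ranking.append((cand, best_gain, best_edge))
    ranking.sort(key=lambda r: r[1], reverse=True)
    return ranking

# Synthetic expression data (hypothetical): cand_linked influences core2,
# while cand_noise is unrelated to the core genes.
rng = np.random.default_rng(0)
n = 200
core1 = rng.normal(size=n)
cand_linked = rng.normal(size=n)
cand_noise = rng.normal(size=n)
core2 = 2.0 * core1 + 1.5 * cand_linked + 0.5 * rng.normal(size=n)
data = {"core1": core1, "core2": core2,
        "cand_linked": cand_linked, "cand_noise": cand_noise}
core_parents = {"core1": [], "core2": ["core1"]}

ranking = bn_plus_one(data, core_parents, ["cand_noise", "cand_linked"])
for cand, gain, edge in ranking:
    print(cand, round(gain, 2), edge)
```

Because the score decomposes over nodes, each gain is computed from the one local term that changes, and a new node with a single edge cannot introduce a cycle. A fuller treatment would search over multi-edge attachments and use whichever score and data discretization the pathway model actually relies on.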

