Connectivity in the Yeast Cell Cycle Transcription Network: Inferences from Neural Networks

Christopher E Hart,Barbara J Wold,Eric Mjolsness

doi:10.1371/journal.pcbi.0020169

Abstract

A current challenge is to develop computational approaches to infer gene network regulatory relationships based on multiple types of large-scale functional genomic data. We find that single-layer feed-forward artificial neural network (ANN) models can effectively discover gene network structure by integrating global in vivo protein:DNA interaction data (ChIP/Array) with genome-wide microarray RNA data. We test this on the yeast cell cycle transcription network, which is composed of several hundred genes with phase-specific RNA outputs. These ANNs were robust to noise in data and to a variety of perturbations. They reliably identified and ranked 10 of 12 known major cell cycle factors at the top of a set of 204, based on a sum-of-squared weights metric. Comparative analysis of motif occurrences among multiple yeast species independently confirmed relationships inferred from ANN weights analysis. ANN models can capitalize on properties of biological gene networks that other kinds of models do not. ANNs naturally take advantage of patterns of absence, as well as presence, of factor binding associated with specific expression output; they are easily subjected to in silico “mutation” to uncover biological redundancies; and they can use the full range of factor binding values. A prominent feature of cell cycle ANNs suggested an analogous property might exist in the biological network. This postulated that “network-local discrimination” occurs when regulatory connections (here between MBF and target genes) are explicitly disfavored in one network module (G2), relative to others and to the class of genes outside the mitotic network. If correct, this predicts that MBF motifs will be significantly depleted from the discriminated class and that the discrimination will persist through evolution. Analysis of distantly related Schizosaccharomyces pombe confirmed this, suggesting that network-local discrimination is real and complements well-known enrichment of MBF sites in G1 class genes.

Highlights

Hundreds of yeast RNAs are expressed in a cell cycle–dependent, oscillating manner
The primary goal of the artificial neural network (ANN) modeling is to infer the set of regulatory connections that underlies each of the cell cycle–phased expression groups
ANNs were trained to assign expression cluster membership for each gene based on 204 measured binding probabilities from chromatin IP (ChIP)/array experiments [21]
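
The classification setup described above, together with the sum-of-squared weights ranking mentioned in the abstract, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic (six toy "factors" and three expression classes rather than 204 factors and the real clusters), the softmax output with plain gradient descent is one common way to train a single-layer classifier and is not necessarily the authors' training procedure, and all function names are hypothetical.

```python
import math
import random


def make_synthetic_data(n_per_class=50, n_classes=3, n_noise=3, seed=0):
    """Toy stand-in for ChIP binding probabilities (synthetic, NOT real data)."""
    rng = random.Random(seed)
    n_factors = n_classes + n_noise
    X, y = [], []
    for c in range(n_classes):
        for _ in range(n_per_class):
            row = []
            for j in range(n_factors):
                if j < n_classes:
                    # Informative factor j: bound in class j, largely absent elsewhere.
                    bound = rng.uniform(0.7, 1.0)
                    unbound = rng.uniform(0.0, 0.3)
                    row.append(bound if j == c else unbound)
                else:
                    # Uninformative factor: binding unrelated to class.
                    row.append(rng.uniform(0.0, 1.0))
            X.append(row)
            y.append(c)
    return X, y


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def scores(W, b, x):
    # One weighted sum per output class: no hidden layer.
    return [sum(w * v for w, v in zip(W[c], x)) + b[c] for c in range(len(W))]


def train_single_layer(X, y, n_classes, epochs=300, lr=0.5):
    """Full-batch gradient descent on cross-entropy loss."""
    n, d = len(X), len(X[0])
    W = [[0.0] * d for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        gW = [[0.0] * d for _ in range(n_classes)]
        gb = [0.0] * n_classes
        for x, label in zip(X, y):
            p = softmax(scores(W, b, x))
            for c in range(n_classes):
                err = p[c] - (1.0 if c == label else 0.0)
                gb[c] += err
                for j in range(d):
                    gW[c][j] += err * x[j]
        for c in range(n_classes):
            b[c] -= lr * gb[c] / n
            for j in range(d):
                W[c][j] -= lr * gW[c][j] / n
    return W, b


def predict(W, b, x):
    s = scores(W, b, x)
    return s.index(max(s))


def rank_factors(W):
    """Rank input factors by the sum of their squared weights across classes."""
    ssw = [sum(W[c][j] ** 2 for c in range(len(W))) for j in range(len(W[0]))]
    order = sorted(range(len(ssw)), key=lambda j: ssw[j], reverse=True)
    return order, ssw


X, y = make_synthetic_data()
W, b = train_single_layer(X, y, n_classes=3)
ranking, ssw = rank_factors(W)
accuracy = sum(predict(W, b, x) == t for x, t in zip(X, y)) / len(X)
print("factor ranking:", ranking)
print("training accuracy:", accuracy)
```

In this toy setting, negative weights matter as much as positive ones: a factor whose binding is absent from a class contributes to the squared-weight score just as a factor whose binding is present does, which is the sense in which such a model can use patterns of absence as well as presence.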

Summary

Introduction

Hundreds of yeast RNAs are expressed in a cell cycle–dependent, oscillating manner. In both budding yeast and fission yeast, these RNAs cluster into four or five groups, each corresponding roughly to a phase of the cycle [1,2,3,4,5,6,7,8,9]. The complete composition and connectivity of the cell cycle transcription network are not yet known for any eukaryote, and many components may vary over long evolutionary distances [3,4,5,13], but some specific regulators (e.g., MBF of yeast and the related E2Fs of plants and animals) are paneukaryotic, as are some of their direct target genes (DNA polymerase, ribonucleotide reductase). One way to infer the network's regulatory connections is to integrate multiple genome-wide data types that impinge on connection inference, including factor:DNA interaction data from chromatin IP (ChIP) studies, RNA expression patterns, and comparative genomic analysis. This is appealing partly because these assays are genome-comprehensive and hypothesis-independent, so they can, in principle, reveal regulatory relationships not detected by classical genetics.

Objectives

Methods

Results

Conclusion