Abstract

Predicting the transcription regulators responsible for observed expression changes from transcriptome data is one of the most promising computational approaches to understanding cellular processes and characteristics. Here, we present a novel method that uses large amounts of chromatin immunoprecipitation (ChIP) experimental data to address this problem. Global high-throughput ChIP data were collected to construct a comprehensive database containing 8 578 738 binding interactions of 454 transcription regulators. To incorporate information about the heterogeneous frequencies of transcription factor (TF)-binding events, we developed a flexible framework for gene set analysis based on a weighted t-test procedure, namely weighted parametric gene set analysis (wPGSA). Using transcriptome data as input, wPGSA predicts the activities of the transcription regulators responsible for the observed gene expression. Validation of wPGSA with published transcriptome data, including data from cells over-expressing TFs, showed that the method can predict the activities of various TFs, regardless of cell type and conditions, with results consistent with biological observations. We also applied wPGSA to other published transcriptome data and identified potential key regulators of cell reprogramming and influenza virus pathogenesis, generating compelling hypotheses regarding the underlying regulatory mechanisms. By incorporating high-throughput experimental data in the form of weights, this flexible framework will contribute to uncovering the dynamic and robust architectures of biological regulation.
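
The idea of folding binding frequencies into a t-type statistic as weights can be illustrated with a minimal sketch. This is not the wPGSA formula itself, which the abstract does not give. It assumes a one-sample weighted t-like score in which each gene's log fold-change is weighted by a hypothetical ChIP binding frequency for one regulator, and the weighted mean is compared against the unweighted mean over all genes. The function name `weighted_t_score`, the use of an effective sample size, and the example numbers are all assumptions made for illustration.

```python
import math


def weighted_t_score(log_fc, weights):
    """Illustrative weighted t-like score for one regulator (not the published wPGSA formula).

    log_fc  : per-gene log fold-changes from a transcriptome comparison
    weights : per-gene non-negative weights, e.g. how often the regulator
              was observed bound near each gene (hypothetical values)
    """
    if len(log_fc) != len(weights):
        raise ValueError("log_fc and weights must have the same length")
    total_w = sum(weights)
    if total_w <= 0:
        raise ValueError("at least one weight must be positive")

    # Weighted mean of expression changes over the regulator's targets.
    w_mean = sum(w * x for w, x in zip(weights, log_fc)) / total_w
    # Weighted variance around the weighted mean.
    w_var = sum(w * (x - w_mean) ** 2 for w, x in zip(weights, log_fc)) / total_w
    # Effective sample size for unequal weights (assumption: Kish's approximation).
    n_eff = total_w ** 2 / sum(w * w for w in weights)

    # Background: unweighted mean over all genes.
    global_mean = sum(log_fc) / len(log_fc)

    se = math.sqrt(w_var / n_eff)
    if se == 0:
        raise ValueError("zero weighted variance; score is undefined")
    return (w_mean - global_mean) / se


# Hypothetical example: the two most strongly bound genes are up-regulated.
example_log_fc = [2.0, 1.5, -0.2, 0.1, -0.1, 0.0]
example_weights = [5, 4, 0, 0, 1, 0]
score = weighted_t_score(example_log_fc, example_weights)
```

In this sketch, a positive score suggests that genes frequently bound by the regulator are shifted upward relative to the background, and a negative score suggests the opposite. With equal weights for every gene the weighted mean equals the background mean, so the score is zero.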
