Abstract

In multicellular organisms, transcription factors (TFs) and microRNAs (miRNAs) constitute the two largest families of molecules that modulate messenger RNA (mRNA) expression through transcriptional and post-transcriptional regulation. While mRNA and miRNA expression can be measured by microarray, TF activities, as manifested by their protein expression, remain difficult to observe, which makes reconstructing a collaborative gene regulatory network (GRN) of TFs and miRNAs from expression data a complex problem. In this paper, a novel Bayesian sparse non-negative factor regression (BSNFR) model is proposed for modeling the joint regulation of mRNAs by TFs and miRNAs and for integrating multiple data types, including gene expression, miRNA expression, TF target genes, and miRNA targets. Powered by a Gibbs sampling solution, BSNFR can infer both the TF/miRNA-mediated mRNA regulations and the unknown TF activities. Additionally, because BSNFR directly models the non-negative activities of TFs, it avoids the sign ambiguity common to factor models and can also accurately predict the type (up or down) of each regulation. BSNFR also includes a nonparametric Bayesian model for the latent factor activities, which enables the discovery of clustering among samples due to (disease) subtypes. The proposed BSNFR model and the developed Gibbs sampling solution were validated on simulated systems and applied to real data from glioblastoma multiforme (GBM) patients in The Cancer Genome Atlas (TCGA). A GBM-specific gene regulatory network of TFs and miRNAs was reconstructed. This GBM network includes 107 regulations recorded in existing databases and 16 new regulations. Functional analysis suggests that the regulated genes are enriched in the cell cycle and P53 pathways.
In addition, BSNFR identified 3 clusters among GBM patient samples, two of which show a significant difference in survival (p=0.004). Finally, the estimated TF activities imply that EGR-1 is significantly correlated with patient survival (p=0.004) and may be used as a prognostic biomarker. The data and MATLAB code are available at: http://compgenomics.cbi.utsa.edu/BSNFR .

