Abstract
Decoding the transcriptional programs that govern transcriptomic diversity across multiple human tissues is a major challenge in bioinformatics. To address this problem, a number of computational methods have focused on cis-regulatory codes that drive overexpression or underexpression in a single tissue compared with the others. In contrast, we recently proposed a different approach to mining cis-regulatory codes: starting from gene sets that share common cis-regulatory motifs, the method screens for expression modules based on expression coherence. However, both approaches appear insufficient to capture transcriptional programs that control gene expression in only a subset of samples, a limitation that is especially serious when analyzing data from multiple tissues. To overcome it, we developed a new module discovery method, termed BEEM (Biclustering-based Extraction of Expression Modules), that discovers expression modules functional in a subset of tissues. When applied to expression profiles of multiple human tissues, BEEM found expression modules missed by two existing approaches, one based on coherent expression and the other on single tissue-specific differential expression. From the BEEM results, we obtained new insights into the transcriptional programs controlling transcriptomic diversity across various types of tissues. This study introduces BEEM as a powerful tool for decoding regulatory programs from a compendium of gene expression profiles.
Highlights
Predicting the cis-regulatory codes that govern transcriptional programs in specific cell types has been intensively investigated by combining microarray gene expression data with cis-regulatory sequences or related information such as ChIP-chip experiments.
Our results suggest that BEEM successfully captures sample subgroup-specific expression modules while also performing reasonably well on coherent expression modules, which are most efficiently captured by EEM.
We found that BEEM and EEM produce relatively similar results, but their performance appears to depend on the heterogeneity of the input transcriptome data: BEEM works better on more heterogeneous data such as the multiple-tissue data set.
Summary
Predicting the cis-regulatory codes that govern transcriptional programs in specific cell types has been intensively investigated by combining microarray gene expression data with cis-regulatory sequences or related information such as ChIP-chip experiments. Several attempts have been made to identify tissue-specific cis-regulatory codes by applying these methods to microarray data from multiple human tissues in order to understand their diversity [1,2,3,4,5]. Because these methods only compare overexpression and underexpression in a single tissue with those in the other tissues, they can find only single-tissue-specific cis-regulatory codes; codes shared across several tissues may have gone undiscovered. EEM, in turn, assumes that module genes, i.e., genes belonging to the same expression module, behave coherently across all samples. It can therefore fail to identify a sample subgroup-specific expression module, i.e., a module whose genes exhibit coherent expression patterns over only a subset of samples. This problem is likely to be serious when analyzing a diverse gene expression data set such as a multiple-tissue data set.
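The limitation described above can be illustrated with a small synthetic example. The sketch below is not the BEEM or EEM algorithm; it is a toy illustration in which mean pairwise Pearson correlation stands in for an expression coherence score (an assumption made here for simplicity). The function name `mean_pairwise_correlation`, the data dimensions, and the choice of the first four samples as the active subgroup are all hypothetical.

```python
import numpy as np


def mean_pairwise_correlation(expr, samples=None):
    """Mean Pearson correlation over all gene pairs (rows of expr),
    optionally restricted to a subset of samples (columns)."""
    sub = expr if samples is None else expr[:, samples]
    corr = np.corrcoef(sub)                    # genes x genes correlation matrix
    upper = np.triu_indices_from(corr, k=1)    # each gene pair counted once
    return float(corr[upper].mean())


rng = np.random.default_rng(0)
n_genes, n_samples = 6, 24

# Hypothetical subgroup: the first 4 samples are the tissues in which
# the module is assumed to be active.
subgroup = np.arange(4)

# Outside the subgroup, each gene varies independently (no shared program).
expr = rng.normal(0.0, 3.0, size=(n_genes, n_samples))

# Inside the subgroup, all genes follow one shared pattern plus small noise.
shared = np.array([-1.5, -0.5, 0.5, 1.5])
expr[:, subgroup] = shared + rng.normal(0.0, 0.1, size=(n_genes, subgroup.size))

coherence_all = mean_pairwise_correlation(expr)            # all samples
coherence_sub = mean_pairwise_correlation(expr, subgroup)  # subgroup only

print(f"coherence over all samples:    {coherence_all:.2f}")
print(f"coherence over subgroup only:  {coherence_sub:.2f}")
```

In this toy setting the gene set looks strongly coherent within the subgroup but not over all samples, so a screen that scores coherence across every sample would tend to overlook it. A biclustering-style method has the harder task of searching for the sample subset as well as the genes, which this sketch does not attempt.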