Abstract

Transcriptomic profiling is an immensely powerful hypothesis-generating tool. However, accurately predicting the transcription factors (TFs) and cofactors that drive transcriptomic differences between samples is challenging. A number of algorithms draw on ChIP-seq tracks to identify the TFs and cofactors behind gene changes. These approaches assign TFs and cofactors to genes via a binary designation of 'target' or 'non-target', followed by Fisher's exact tests to assess enrichment of TFs and cofactors. ENCODE archives 2314 ChIP-seq tracks of 684 TFs and cofactors assayed across 117 human cell lines under a multitude of growth and maintenance conditions. The algorithm presented herein, Mining Algorithm for GenetIc Controllers (MAGIC), uses ENCODE ChIP-seq data to look for statistical enrichment of TFs and cofactors in gene bodies and flanking regions in gene lists without an a priori binary classification of genes as targets or non-targets. When compared to other TF mining resources, MAGIC displayed favourable performance in predicting TFs and cofactors that drive gene changes in 4 settings: 1) a cell line expressing or lacking a single TF; 2) breast tumors divided along PAM50 designations; 3) whole brain samples from WT mice or mice lacking a single TF in a particular neuronal subtype; 4) single-cell RNA-seq analysis of neurons divided by Immediate Early Gene expression levels. In summary, MAGIC is a standalone application that produces meaningful predictions of TFs and cofactors in transcriptomic experiments.
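
The binary 'target'/'non-target' approach mentioned above can be made concrete with a small sketch. It is an illustration of the general technique only, not the code of any specific tool, and not of MAGIC, which avoids this binary step. The gene names, the set sizes, and the function name `fisher_enrichment_p` are hypothetical. The one-sided Fisher's exact test is computed directly as the upper tail of the hypergeometric distribution.

```python
from math import comb


def fisher_enrichment_p(hits, list_size, targets_in_background, background_size):
    """One-sided Fisher's exact test for over-representation.

    Returns the probability of seeing at least `hits` TF targets in a gene
    list of `list_size` genes drawn from a background of `background_size`
    genes, of which `targets_in_background` are called targets.
    """
    max_hits = min(list_size, targets_in_background)
    total = comb(background_size, list_size)
    non_targets = background_size - targets_in_background
    # Sum the hypergeometric probabilities for k = hits .. max_hits.
    tail = sum(
        comb(targets_in_background, k) * comb(non_targets, list_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


# Toy data (all hypothetical): 20 background genes, 5 of them called
# 'targets' of some TF by a binary ChIP-seq peak cutoff.
background = {f"g{i}" for i in range(20)}
tf_targets = {"g0", "g1", "g2", "g3", "g4"}

# A list of 4 changed genes, 3 of which are called targets.
changed_genes = {"g0", "g1", "g2", "g10"}

hits = len(changed_genes & tf_targets)
p_value = fisher_enrichment_p(
    hits, len(changed_genes), len(tf_targets), len(background)
)
print(f"targets in list: {hits}, one-sided p = {p_value:.4f}")
```

The result depends entirely on where the target/non-target cutoff is drawn, since a gene just below the cutoff contributes nothing to the test. That dependence is the limitation of the binary designation that the abstract contrasts MAGIC against.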

Highlights

  • Key to the control of gene expression is the level of transcript in the cell

  • The datasets can contain tens of thousands of expression values per sample, and the number of samples can be in the thousands, as with the breast cancer transcriptomes archived at The Cancer Genome Atlas (TCGA)[2]

  • Mining Algorithm for GenetIc Controllers (MAGIC) was tested on four transcriptome datasets: 1) MCF7(shCon_vs_shREST)

Introduction

Key to the control of gene expression is the level of transcript in the cell. This level is controlled in large part by transcription factors (TFs) and cofactors. When comparing transcriptomes from two or more conditions, such as normal and cancerous tissue, thousands of mRNA levels can change. The changes reflect both alterations in tissue heterogeneity and alterations in transcriptional regulator function. We posit that in many cases, the majority of cellular transcriptome changes are driven by alterations in the function of a few Factors that coordinate gene programs. Identifying those driving Factors is both a fundamental problem and a therapeutic opportunity.
