Abstract

Transcriptional regulation of protein-coding genes is increasingly well-understood on a global scale, yet no comparable information exists for long non-coding RNA (lncRNA) genes, which were recently recognized to be as numerous as protein-coding genes in mammalian genomes. We performed a genome-wide comparative analysis of the promoters of human lncRNA and protein-coding genes, finding global differences in specific genetic and epigenetic features relevant to transcriptional regulation. These two groups of genes are hence subject to separate transcriptional regulatory programs, including distinct transcription factor (TF) proteins that significantly favor lncRNA, rather than coding-gene, promoters. We report a specific signature of promoter-proximal transcriptional regulation of lncRNA genes, including several distinct transcription factor binding sites (TFBS). Experimental DNase I hypersensitive site profiles are consistent with active configurations of these lncRNA TFBS sets in diverse human cell types. TFBS ChIP-seq datasets confirm the binding events that we predicted using computational approaches for a subset of factors. For several TFs known to be directly regulated by lncRNAs, we find that their putative TFBSs are enriched at lncRNA promoters, suggesting that the TFs and the lncRNAs may participate in a bidirectional feedback loop regulatory network. Accordingly, cells may be able to modulate lncRNA expression levels independently of mRNA levels via distinct regulatory pathways. Our results also raise the possibility that, given the historical reliance on protein-coding gene catalogs to define the chromatin states of active promoters, a revision of these chromatin signature profiles to incorporate expressed lncRNA genes is warranted in the future.

Highlights

  • Evidence for important, including essential, cellular and organismal roles of long non-coding RNA (lncRNA) in mammalian systems began to emerge prior to the advent of high-throughput genome and transcriptome sequencing

  • We found that A/T-rich mono-, di- and tri-nucleotide patterns are enriched at the promoters of lncRNA genes, relative to the promoters of protein-coding genes (“differentially enriched at lncRNA promoters”) (Table S1)

  • We speculate that specific transcription factors (TFs) may function as network nodes that accept directional edges from regulatory lncRNAs, and serve as network hubs that extend multiple new directional edges toward other lncRNA genes whose promoters contain their cognate transcription factor binding sites (TFBS)

Introduction

Evidence for important, including essential, cellular and organismal roles of lncRNA in mammalian systems began to emerge prior to the advent of high-throughput genome and transcriptome sequencing. These early examples included the demonstration that the lncRNA XIST [1] was necessary and sufficient for X-chromosome silencing, as well as the discovery of SRA [2], an lncRNA that directly regulates estrogen receptor α, one of the nuclear hormone receptors. Many lncRNA transcripts, similar to mRNAs, are 5′-capped, polyadenylated, frequently spliced with conventional GT-AG intron excision, and readily evident in cytoplasmic polyA+ RNA preparations; thousands of lncRNAs have been discovered from cDNA libraries [5], and abundant nuclear and polyA− lncRNAs have also been identified [6].
