Abstract

Mammalian genomes harbor a larger than expected number of complex loci, in which multiple genes are coupled by shared transcribed regions in antisense orientation and/or by bidirectional core promoters. To determine the incidence, functional significance, and evolutionary context of mammalian complex loci, we identified and characterized 5,248 cis–antisense pairs, 1,638 bidirectional promoters, and 1,153 chains of multiple cis–antisense and/or bidirectionally promoted pairs from 36,606 mouse transcriptional units (TUs), along with 6,141 cis–antisense pairs, 2,113 bidirectional promoters, and 1,480 chains from 42,887 human TUs. In both human and mouse, 25% of TUs resided in cis–antisense pairs, only 17% of which were conserved between the two organisms, indicating frequent species specificity of antisense gene arrangements. A sampling approach indicated that over 40% of all TUs might actually be in cis–antisense pairs, and that only a minority of these arrangements are likely to be conserved between human and mouse. Bidirectional promoters were characterized by variable transcriptional start sites and an identifiable midpoint at which overall sequence composition changed strand and the direction of transcriptional initiation switched. In microarray data covering a wide range of mouse tissues, genes in cis–antisense and bidirectionally promoted arrangement showed a higher probability of being coordinately expressed than random pairs of genes. In a case study on homeotic loci, we observed extensive transcription of nonconserved sequences on the noncoding strand, implying that the presence rather than the sequence of these transcripts is of functional importance. Complex loci are ubiquitous, host numerous nonconserved gene structures and lineage-specific exonification events, and may have a cis-regulatory impact on the member genes.

Highlights

  • Several recent reports indicate that the transcriptional complexity of mammalian genomes has been significantly underestimated

  • As part of our effort to characterize complex loci, we report to our knowledge the most comprehensive list to date: 6,141 cis–antisense pairs in the human genome and 5,248 in the mouse

  • While methodological differences in redundancy reduction and clustering preclude direct comparison to earlier estimates based on cDNA and expressed sequence tag (EST) data [10,12,21], the observed 2-fold difference in cis–antisense pair counts compared to these previous reports and widespread chaining of bidirectional transcription indicate that the earlier studies underestimated the prevalence and complexity of antisense transcription in human

Introduction

Several recent reports indicate that the transcriptional complexity of mammalian genomes has been significantly underestimated. Large-scale sequencing of full-length transcripts, expressed sequence tags (ESTs), and shorter tags [1], together with transcriptional maps constructed using tiling arrays [2,3,4,5], demonstrate that human and mouse genomes contain an abundance of complex loci with overlapping transcription on the two DNA strands. Although individual complex loci have been described in detail [6,7,8], a global description of the general properties of gene arrangements within such complex loci is lacking.
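
To make the three gene arrangements named in the abstract concrete, the following is a minimal Python sketch of how cis–antisense pairs, bidirectionally promoted pairs, and chains could be detected from TU coordinates. It is an illustration under stated assumptions, not the procedure used in the study: the `TU` record, the function names, the toy coordinates, and the 1,000-bp start-site window (`MAX_TSS_GAP`) are all hypothetical, and a chain is read here as a connected group of at least three TUs linked by such pairs.

```python
from dataclasses import dataclass
from itertools import combinations

# Assumption for illustration only: two divergent TUs are treated as sharing
# a bidirectional promoter if their start sites lie within this many bases.
MAX_TSS_GAP = 1000


@dataclass(frozen=True)
class TU:
    """Hypothetical transcriptional unit record (inclusive coordinates)."""
    name: str
    chrom: str
    start: int   # leftmost genomic coordinate
    end: int     # rightmost genomic coordinate
    strand: str  # "+" or "-"

    @property
    def tss(self) -> int:
        # The start site is the left end on "+" and the right end on "-".
        return self.start if self.strand == "+" else self.end


def is_cis_antisense(a: TU, b: TU) -> bool:
    """Opposite strands, same chromosome, overlapping transcribed regions."""
    return (a.chrom == b.chrom and a.strand != b.strand
            and a.start <= b.end and b.start <= a.end)


def is_bidirectional(a: TU, b: TU, max_gap: int = MAX_TSS_GAP) -> bool:
    """Head-to-head (divergent) TUs whose start sites are close together."""
    if a.chrom != b.chrom or a.strand == b.strand:
        return False
    plus, minus = (a, b) if a.strand == "+" else (b, a)
    gap = plus.tss - minus.tss  # positive only for divergent orientation
    return 0 < gap <= max_gap


def find_chains(tus: list[TU]) -> list[list[str]]:
    """Connected groups of three or more TUs linked by either pair type."""
    adjacent: dict[str, set[str]] = {}
    for a, b in combinations(tus, 2):
        if is_cis_antisense(a, b) or is_bidirectional(a, b):
            adjacent.setdefault(a.name, set()).add(b.name)
            adjacent.setdefault(b.name, set()).add(a.name)
    seen: set[str] = set()
    chains: list[list[str]] = []
    for name in adjacent:
        if name in seen:
            continue
        component: set[str] = set()
        stack = [name]
        while stack:  # depth-first walk over linked TUs
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(adjacent[current] - component)
        seen |= component
        if len(component) >= 3:
            chains.append(sorted(component))
    return chains


# Toy coordinates, invented for this example.
tu_a = TU("A", "chr1", 1000, 5000, "+")
tu_b = TU("B", "chr1", 4000, 8000, "-")   # overlaps A on the other strand
tu_c = TU("C", "chr1", 8500, 12000, "+")  # starts 500 bp from B's start site
tu_d = TU("D", "chr2", 100, 900, "+")     # unlinked
chains = find_chains([tu_a, tu_b, tu_c, tu_d])
```

In this toy set, A and B form a cis–antisense pair, B and C share a putative bidirectional promoter, and the three together form one chain, while D stays outside any complex locus. Changing `MAX_TSS_GAP` changes which divergent pairs qualify, so any count obtained this way depends on that assumed threshold.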
