Abstract
Background: Although the overlap of transcriptional units occurs frequently in eukaryotic genomes, its evolutionary and biological significance remains largely unclear. Here we report a comparative analysis of overlaps between genes coding for well-annotated proteins in five metazoan genomes (human, mouse, zebrafish, fruit fly and worm).
Results: For all analyzed species the observed number of overlapping genes is always lower than expected assuming functional neutrality, suggesting that gene overlap is negatively selected. The comparison to the random distribution also shows that retained overlaps do not exhibit random features: antiparallel overlaps are significantly enriched, while overlaps lying on the same strand and those involving coding sequences are highly underrepresented. We confirm that overlap is mostly species-specific and provide evidence that it frequently originates through the acquisition of terminal, non-coding exons. Finally, we show that overlapping genes tend to be significantly co-expressed in a breast cancer cDNA library obtained by 454 deep sequencing, and that different overlap types display different patterns of reciprocal expression.
Conclusion: Our data suggest that overlap between protein-coding genes is selected against in Metazoa. However, when retained it may be used as a species-specific mechanism for the reciprocal regulation of neighboring genes. The tendency of overlaps to involve non-coding regions of the genes leads to the speculation that the advantages achieved by an overlapping arrangement may be optimized by evolving regulatory non-coding transcripts.
Highlights
Although the overlap of transcriptional units occurs frequently in eukaryotic genomes, its evolutionary and biological significance remains largely unclear
We show that overlapping genes tend to be significantly co-expressed in a breast cancer cDNA library obtained by 454 deep sequencing, and that different overlap types display different patterns of reciprocal expression
When retained, gene overlap may be used as a species-specific mechanism for the reciprocal regulation of neighboring genes
Summary
Although the overlap of transcriptional units occurs frequently in eukaryotic genomes, its evolutionary and biological significance remains largely unclear. The majority of detected overlaps occur between genes transcribed from opposite strands of the same genomic locus and often involve non-coding RNAs [6,9,10,11,12,13,14]. Overlap is estimated to involve around 10% of protein-coding genes [13,25], rising to 20%–60% when non-coding RNAs are included [6,8,9,10,12,14,26,27]. Despite their abundance, the origin and evolution of overlapping genes in eukaryotes remain unclear, and different comparative studies have often led to discordant results [6,12,13,14,25]. To evaluate the possible relationship between overlap and gene expression, we analyzed the expression of our set of overlapping genes in a human breast cancer cDNA library derived by 454 deep sequencing.
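To make the distinction between same-strand and antiparallel overlaps concrete, the following is a minimal illustrative sketch, not the procedure used in the study. The `Gene` record, the 1-based inclusive coordinates, and the function names are all assumptions introduced here for illustration.

```python
from typing import NamedTuple, Optional


class Gene(NamedTuple):
    """Hypothetical gene record; coordinates are assumed 1-based and inclusive."""
    name: str
    start: int
    end: int
    strand: str  # '+' or '-'


def overlap_length(a: Gene, b: Gene) -> int:
    """Number of genomic positions shared by the two gene spans (0 if disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def classify_overlap(a: Gene, b: Gene) -> Optional[str]:
    """Return 'same-strand' or 'antiparallel' for overlapping genes, else None."""
    if overlap_length(a, b) == 0:
        return None
    return "same-strand" if a.strand == b.strand else "antiparallel"


# Made-up coordinates, purely for illustration
gene_a = Gene("A", 100, 500, "+")
gene_b = Gene("B", 400, 900, "-")
gene_c = Gene("C", 450, 600, "+")
gene_d = Gene("D", 501, 700, "-")

kind_ab = classify_overlap(gene_a, gene_b)  # opposite strands, shared span
kind_ac = classify_overlap(gene_a, gene_c)  # same strand, shared span
kind_ad = classify_overlap(gene_a, gene_d)  # adjacent but not overlapping
```

A real analysis would also need chromosome identity and exon-level annotation to tell coding from non-coding overlap; both are omitted here for brevity.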