Abstract

Homeobox genes code for transcription factors that contain a DNA-binding helix-turn-helix structure called a homeodomain and that play a crucial role in pattern formation during embryogenesis. Many homeobox genes are located in clusters and some of these, most notably the HOX genes, are known to have antisense or opposite strand long non-coding RNA (lncRNA) genes that play a regulatory role. Because automated annotation of both gene clusters and non-coding genes is fraught with difficulty (over-prediction, under-prediction, inaccurate transcript structures), we set out to manually annotate all homeobox genes in the mouse and human genomes. This includes all supported splice variants, pseudogenes and both antisense and flanking lncRNAs. One of the areas where manual annotation has a significant advantage is the annotation of duplicated gene clusters. After comprehensive annotation of all homeobox genes and their antisense genes in human and in mouse, we found some discrepancies with the current gene set in RefSeq regarding exact gene structures and coding versus pseudogene locus biotype. We also identified previously unannotated pseudogenes in the DUX, Rhox and Obox gene clusters, which helped us re-evaluate and update the gene nomenclature in these regions. We found that, compared with their mouse orthologues, human homeobox genes are enriched in antisense lncRNA loci, some of which are known to play a role in gene or gene cluster regulation. Of the annotated set of 241 human protein-coding homeobox genes, 98 have an antisense locus (41%), while of the 277 orthologous mouse genes, only 62 protein-coding genes have an antisense locus (22%), based on publicly available transcriptional evidence.
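The antisense percentages quoted above follow directly from the reported counts. A minimal sketch of that arithmetic, assuming only the four counts given in the abstract (the dictionary layout and the function name are illustrative and not taken from the paper):

```python
# Counts as reported in the abstract; all names here are illustrative only.
counts = {
    "human": {"protein_coding": 241, "with_antisense": 98},
    "mouse": {"protein_coding": 277, "with_antisense": 62},
}


def antisense_percentage(with_antisense: int, protein_coding: int) -> float:
    """Percentage of protein-coding homeobox genes that have an antisense locus."""
    return 100.0 * with_antisense / protein_coding


percentages = {
    species: antisense_percentage(c["with_antisense"], c["protein_coding"])
    for species, c in counts.items()
}

for species, pct in percentages.items():
    # Rounded to whole percent, as in the abstract.
    print(f"{species}: {pct:.0f}%")
```

This only reproduces the rounding of the reported fractions; it does not test whether the difference between species is statistically significant.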

Highlights

  • Homeobox genes code for transcription factors that have the homeodomain, a DNA-binding helix-turn-helix structure encoded by the homeobox, as the defining feature [1]

  • Data presented in the paper showed that the human genome is enriched in neighboring long non-coding RNAs (lncRNAs) compared with the mouse genome. In some genomic regions, a human protein-coding gene had an antisense lncRNA where the mouse orthologue had an opposite-strand lincRNA, and vice versa. This finding implies that the antisense nature of non-coding RNAs is not as crucial as the simple presence of opposite-strand lncRNAs in the vicinity of a coding gene or gene cluster. This observation is in line with emerging experimental data showing that lncRNA function is more complex than can be inferred from genomic position relative to coding genes [20]

  • HomeoDB lists four human CPHX genes, but we suggest that the two CPHXR genes on chromosome 10 and/or the DUXBLR they are flanking are the two newly annotated DUX pseudogenes presented here (Figure 3A)

Summary

Introduction

Homeobox genes code for transcription factors that have the homeodomain, a DNA-binding helix-turn-helix structure encoded by the homeobox, as the defining feature [1]. Data presented in the paper showed that the human genome is enriched in neighboring lncRNAs compared with the mouse genome. In some genomic regions, a human protein-coding gene had an antisense lncRNA where the mouse orthologue had an opposite-strand lincRNA, and vice versa. This finding implies that the antisense nature of non-coding RNAs (as currently defined) is not as crucial as the simple presence of opposite-strand lncRNAs in the vicinity of a coding gene or gene cluster. This observation is in line with emerging experimental data showing that lncRNA function is more complex than can be inferred from genomic position relative to coding genes [20]. We present an updated analysis of the homeobox gene-containing regions in human and mouse, highlight the similarities and differences in architecture within each genome, and give insights into their evolution.
