Abstract

Background: Homeobox genes encode a diverse set of transcription factors implicated in a vast range of biological processes including, but not limited to, embryonic cell fate specification and patterning. Although numerous studies report expression of particular sets of homeobox genes, a systematic analysis of the tissue specificity of homeobox genes is lacking.

Results: Here we analyse publicly-available transcriptome data from human and mouse developmental stages, and adult human tissues, to identify groups of homeobox genes with similar expression patterns. We calculate expression profiles for 242 human and 278 mouse homeobox loci across a combination of 59 human and 12 mouse adult tissues, early and late developmental stages. This revealed 20 human homeobox genes with widespread expression, primarily from the TALE, CERS and ZF classes. Most homeobox genes, however, have greater tissue-specificity, allowing us to compile homeobox gene expression lists for neural tissues, immune tissues, reproductive and developmental samples, and for numerous organ systems. In mouse development, we propose four distinct phases of homeobox gene expression: oocyte to zygote; 2-cell; 4-cell to blastocyst; early to mid post-implantation. The final phase change is marked by expression of ANTP class genes. We also use these data to compare expression specificity between evolutionarily-based gene classes, revealing that ANTP, PRD, LIM and POU homeobox gene classes have highest tissue specificity while HNF, TALE, CUT and CERS are most widely expressed.

Conclusions: The homeobox genes comprise a large superclass and their expression patterns are correspondingly diverse, although in a broad sense related to an evolutionarily-based classification. The ubiquitous expression of some genes suggests roles in general cellular processes; in contrast, most human homeobox genes have greater tissue specificity and we compile useful homeobox datasets for particular tissues, organs and developmental stages. The identification of a set of eutherian-specific homeobox genes peaking from human 8-cell to morula stages suggests co-option of new genes to new developmental roles in evolution.

Electronic supplementary material: The online version of this article (doi:10.1186/s12861-016-0140-y) contains supplementary material, which is available to authorized users.

Highlights

  • Homeobox genes encode a diverse set of transcription factors implicated in a vast range of biological processes including, but not limited to, embryonic cell fate specification and patterning

  • We did not use published fragments per kilobase per million sequencing reads (FPKM) data or reads per kilobase per million sequencing reads (RPKM) data, but took a range of publicly available transcriptome sequencing (RNAseq) data files for each human tissue, organ sample or developmental stage, and remapped the raw sequence reads to human genome assembly NCBI GRCh38.p2

  • A clear pattern is that most homeobox genes have moderately specific expression patterns; by this we mean that most genes have one site of maximal expression, and few other tissues with high or moderate expression, with most tissues being negative or substantially lower
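The two quantities mentioned in the highlights above, a length- and depth-normalised expression value and the degree of tissue specificity, can be illustrated with a minimal Python sketch. It makes several assumptions: `fpkm` uses the standard textbook FPKM definition, not necessarily the authors' exact pipeline; `tau` is the widely used tau specificity index, swapped in here for illustration because this excerpt does not state which specificity measure the study used; the function and parameter names are hypothetical.

```python
from typing import Sequence


def fpkm(fragments: int, gene_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments.

    Standard definition (an assumption about the pipeline, not taken
    from the source): fragments * 1e9 / (length in bp * library size).
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def tau(expression: Sequence[float]) -> float:
    """Tau tissue-specificity index for one gene across tissues.

    Illustrative measure only (not confirmed as the study's method).
    Expects non-negative expression values, one per tissue.
    Returns 0.0 for perfectly uniform expression and 1.0 when the gene
    is expressed in a single tissue.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("need expression values for at least two tissues")
    peak = max(expression)
    if peak <= 0:
        # Convention chosen for this sketch: an unexpressed gene gets 0.0.
        return 0.0
    return sum(1.0 - x / peak for x in expression) / (n - 1)
```

Under this sketch, a gene with one site of maximal expression and low or absent expression elsewhere scores close to 1, while a ubiquitously expressed gene scores close to 0, which matches the "moderately specific" versus "widespread" distinction drawn in the text.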



Introduction

Homeobox genes encode a diverse set of transcription factors implicated in a vast range of biological processes including, but not limited to, embryonic cell fate specification and patterning. Although numerous studies report expression of particular sets of homeobox genes, a systematic analysis of the tissue specificity of homeobox genes is lacking. The large number of genes is mirrored by a vast range of reported expression sites and biological roles, such that few general statements can be made about homeobox gene function. Although the great majority of homeobox genes encode transcription factors, even this general statement might not be true for every gene, since some homeodomains have reported roles in RNA binding [4] or in modification of higher-order chromatin structure [5]; a few vertebrate homeobox genes (the CERS genes) even encode probable transmembrane proteins [6]. The scheme of Bürglin and Affolter [7] is broadly similar but erects 16 classes, dividing PRD and TALE into two and five classes respectively.
