Abstract
Background: Sequence-specific DNA-binding proteins, with their paramount importance in the regulation of expression of the genetic material, are encoded by approximately 5% of the genes in an animal’s genome. But it is unclear to what extent alternative transcripts from these genes may further increase the complexity of the transcription factor complement.
Results: Of the 938 potential C. elegans transcription factor genes, 197 were annotated in WormBase as encoding at least two distinct isoforms. Evaluation of prior evidence identified, with different levels of confidence, 50 genes with alternative transcript starts, 23 with alternative transcript ends, 35 with alternative splicing and 34 with alternative transcripts generated by a combination of mechanisms, leaving 55 that were discounted. Expression patterns were determined for transcripts for a sample of 29 transcription factor genes, concentrating on those with alternative transcript starts for which the evidence was strongest. Seamless fosmid recombineering was used to generate reporter gene fusions with minimal modification to assay expression of specific transcripts while maintaining the broad genomic DNA context and alternative transcript production. Alternative transcription factor gene transcripts were typically expressed with identical or substantially overlapping distributions rather than in distinct domains.
Conclusions: Increasingly sensitive sequencing technologies will reveal rare transcripts but many of these are clearly non-productive. The majority of the transcription factor gene alternative transcripts that are productive may represent tolerable noise rather than encoding functionally distinct isoforms.
Highlights
Sequence-specific DNA-binding proteins, with their paramount importance in the regulation of expression of the genetic material, are encoded by approximately 5% of the genes in an animal’s genome.
Assessment of prior evidence for alternative transcription factor isoforms: before any reporter gene fusions were generated, we first compiled a list of C. elegans transcription factor genes likely to encode multiple isoforms.
Since the compendium of 934 C. elegans transcription factor genes used as the starting point was originally published [4], a few genes have been added or removed [5].
Summary
Sequence-specific DNA-binding proteins, with their paramount importance in the regulation of expression of the genetic material, are encoded by approximately 5% of the genes in an animal’s genome. The proportion of the genome devoted to encoding transcription factors increases with genome size, highlighting their significance to biological complexity, and is around 5% for metazoans [1]. An extensive bioinformatics study, based on gene ontology and on DNA sequence predicted to encode known DNA-binding domains, identified 938 potential transcription factor genes in the C. elegans genome [4,5]. The number of potential alternative transcripts for transcription factor genes is likely to increase still further. How many of these transcripts encode functionally distinct transcription factor isoforms, and how many are noise in the system, is not yet clear.