Abstract
Long regulatory elements (LREs), such as CpG islands, poly(dA:dT) tracts or AU-rich elements, are thought to play key roles in gene regulation but, unlike conventional transcription factor binding sites, few methods have been proposed to formally and automatically characterize them. We present here a computational approach named DExTER (Domain Exploration To Explain gene Regulation) dedicated to the identification of candidate LREs (cLREs) and apply it to the analysis of the genomes of P. falciparum and other eukaryotes. Our analyses show that all tested genomes contain several cLREs that are somewhat conserved during evolution, and that gene expression can be predicted with surprising accuracy on the basis of these long regions only. Regulation by cLREs exhibits very different behaviours depending on species and conditions. In P. falciparum and other Apicomplexan organisms as well as in Dictyostelium discoideum, the process appears highly dynamic, with different cLREs involved at different phases of the life cycle. For multicellular organisms, the same cLREs are involved in all tissues, but a dynamic behaviour is observed across embryonic developmental stages. In P. falciparum, whose genome is known to be strongly depleted of transcription factors, cLREs are predictive of expression with an accuracy above 70%, and our analyses show that they are associated with both transcriptional and post-transcriptional regulation signals. Moreover, we assessed the biological relevance of one LRE discovered by DExTER in P. falciparum using an in vivo reporter assay. The source code (Python) of DExTER is available at https://gite.lirmm.fr/menichelli/DExTER.
Highlights
Gene expression is regulated at different levels and by different mechanisms in Eukaryotes
Our analyses show that all tested genomes contain several candidate long regulatory elements (cLREs) that are somewhat conserved during evolution, and that gene expression can be predicted with surprising accuracy on the basis of these long regions only
In P. falciparum, whose genome is known to be strongly depleted of transcription factors, cLREs are predictive of expression with an accuracy above 70%, and our analyses show that they are associated with both transcriptional and post-transcriptional regulation signals
Summary
Gene expression is regulated at different levels and by different mechanisms in Eukaryotes. The main factors of the general transcription machinery are present in the Plasmodium genome, yet only a few specific TFs (mostly belonging to the apicomplexan AP2 TF family) have been identified and validated [2,3,4,5,6,7]. They constitute approximately 1% of all protein-coding genes [8, 9], compared to 3% in yeast or 6% in humans. Several studies have shown that post-transcriptional regulation (mRNA degradation) and translational control mechanisms operate in this parasite (see for example [14,15,16,17]).