Abstract

Eukaryotic gene expression is regulated by transcription factors (TFs) binding to promoters as well as distal enhancers. TFs recognize short but specific binding sites (TFBSs) located within these promoter and enhancer regions. Functionally relevant TFBSs are often highly conserved during evolution, leaving a strong phylogenetic signal. While multiple sequence alignment (MSA) is a potent tool for detecting this phylogenetic signal, current MSA implementations are optimized to align the maximum number of identical nucleotides. This approach might result in the omission of conserved motifs that contain interchangeable nucleotides, such as the ETS motif (IUPAC code: GGAW). Here, we introduce ConBind, a novel method that enhances the alignment of short motifs even if their mutual sequence similarity is only partial. ConBind improves the identification of conserved TFBSs by increasing the alignment accuracy of TFBS families within orthologous DNA sequences. Functional validation of the Gfi1b +13 enhancer reveals that ConBind identifies additional functionally important ETS binding sites that were missed by all other tested alignment tools. In addition to the analysis of known regulatory regions, our web tool is useful for the analysis of TFBSs in previously uncharacterized DNA regions identified through ChIP-sequencing.
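To make the point about interchangeable nucleotides concrete, the following minimal Python sketch scans a sequence for a motif written in IUPAC code, such as GGAW (W = A or T). It is only an illustration of degenerate motif matching and is not ConBind's algorithm; the function names (`find_motif`, `iupac_to_regex`, `reverse_complement`) and the example sequence are hypothetical.

```python
import re

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement table that also covers the ambiguity codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(motif):
    """Reverse complement of an IUPAC motif, e.g. GGAW -> WTCC."""
    return motif.upper().translate(COMPLEMENT)[::-1]


def iupac_to_regex(motif):
    """Turn an IUPAC motif into a regex of character classes."""
    return "".join("[" + IUPAC[c] + "]" for c in motif.upper())


def find_motif(seq, motif):
    """Return sorted (start, strand) hits; 0-based, overlaps allowed."""
    seq = seq.upper()
    hits = []
    for strand, m in (("+", motif), ("-", reverse_complement(motif))):
        # A lookahead lets overlapping occurrences be reported.
        pattern = re.compile("(?=" + iupac_to_regex(m) + ")")
        hits.extend((match.start(), strand) for match in pattern.finditer(seq))
    return sorted(hits)


# Hypothetical sequence: GGAA and GGAT differ at one base,
# yet both are instances of the same GGAW motif.
print(find_motif("ACGGAAGTGGATCC", "GGAW"))
```

In this toy sequence, an identity-based comparison would score GGAA against GGAT as a mismatch at the fourth position, whereas the motif-level view treats both as the same site.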

Highlights

  • Accurate binding of transcription factors (TFs) to DNA is necessary for the normal functioning of all cell types

  • TFs bind to short (5–10 bp) DNA sequences known as DNA binding motifs or TF binding sites (TFBSs), and through interactions with the basic transcriptional machinery they control whether a gene is turned on or off

  • Assuming that functional TFBSs are generally more conserved than non-functional binding sites, the prediction of functional TFBSs relies on the proficiency of multiple sequence alignment (MSA) methods to correctly identify conserved TFBSs
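The dependence on alignment quality described in the last highlight can be sketched in a few lines of Python. Given gapped, already aligned sequences, the sketch reports the alignment columns at which every species has a motif hit starting. This is an illustrative assumption about how conservation might be scored, not the method used by ConBind or any specific MSA tool; the function names and the toy alignments are hypothetical, and only the forward strand is searched.

```python
import re


def motif_columns(gapped, pattern):
    """Alignment columns where a forward-strand motif hit starts."""
    # Column index in the alignment for each ungapped position.
    col_of = [i for i, c in enumerate(gapped) if c != "-"]
    ungapped = gapped.replace("-", "").upper()
    lookahead = re.compile("(?=" + pattern + ")")
    return {col_of[m.start()] for m in lookahead.finditer(ungapped)}


def conserved_columns(alignment, pattern):
    """Columns where every aligned sequence has a hit starting.

    Assumes a non-empty alignment of equal-length gapped strings.
    """
    per_species = [motif_columns(s, pattern) for s in alignment.values()]
    return sorted(set.intersection(*per_species))


ETS = "GGA[AT]"  # regex form of the IUPAC motif GGAW

# Toy alignment in which the ETS-like sites line up in column 2.
well_aligned = {
    "species_a": "ACGGAAGT--CC",
    "species_b": "ACGGATGTAACC",
    "species_c": "A-GGAAGTA-CC",
}

# Same kind of site, but one gap shifts it by a column.
misaligned = {
    "species_a": "AC-GGAAGT",
    "species_b": "ACGGATGT-",
}

print(conserved_columns(well_aligned, ETS))
print(conserved_columns(misaligned, ETS))
```

In the second toy alignment both sequences contain a GGAW site, but a single misplaced gap puts the sites in different columns, so a column-based conservation check no longer sees them as conserved.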



Introduction

Accurate binding of transcription factors (TFs) to DNA is necessary for the normal functioning of all cell types. Functional regulatory DNA elements such as promoters and enhancers are often evolutionarily conserved; comparative DNA sequence analysis has long been recognized as a powerful approach both to locate candidate regulatory regions and to pinpoint critical binding sites within such regions [1,2,3]. Anguita et al. [6] identified a number of conserved non-coding elements (CNEs) containing multiple erythroid-specific TFBSs through a multiple-species sequence comparison approach. Three of these CNEs could be validated as haematopoietic enhancers in transgenic mouse assays [7], highlighting the importance of comparative DNA sequence analysis
