Abstract

BackgroundThe binding sites of sequence specific transcription factors are an important and relatively well-understood class of functional non-coding DNAs. Although a wide variety of experimental and computational methods have been developed to characterize transcription factor binding sites, they remain difficult to identify. Comparison of non-coding DNA from related species has shown considerable promise in identifying these functional non-coding sequences, even though relatively little is known about their evolution.ResultsHere we analyse the genome sequences of the budding yeasts Saccharomyces cerevisiae, S. bayanus, S. paradoxus and S. mikatae to study the evolution of transcription factor binding sites. As expected, we find that both experimentally characterized and computationally predicted binding sites evolve slower than surrounding sequence, consistent with the hypothesis that they are under purifying selection. We also observe position-specific variation in the rate of evolution within binding sites. We find that the position-specific rate of evolution is positively correlated with degeneracy among binding sites within S. cerevisiae. We test theoretical predictions for the rate of evolution at positions where the base frequencies deviate from background due to purifying selection and find reasonable agreement with the observed rates of evolution. Finally, we show how the evolutionary characteristics of real binding motifs can be used to distinguish them from artefacts of computational motif finding algorithms.ConclusionAs has been observed for protein sequences, the rate of evolution in transcription factor binding sites varies with position, suggesting that some regions are under stronger functional constraint than others. This variation likely reflects the varying importance of different positions in the formation of the protein-DNA complex. The characterization of the pattern of evolution in known binding sites will likely contribute to the effective use of comparative sequence data in the identification of transcription factor binding sites and is an important step toward understanding the evolution of functional non-coding DNA.

Highlights

  • The binding sites of sequence specific transcription factors are an important and relatively well-understood class of functional non-coding DNAs

  • Unlike protein sequences, where the relationship of the amino acid sequence to the functional constraint is often difficult to discern, in the case of transcription factor binding sites, we suggest that the evolutionary constraints can be interpreted directly with respect to the physical constraints imposed by the DNA-binding protein

  • Characterized binding sites show fewer substitutions than background DNA We first sought to verify that functional non-coding regions evolve more slowly than 'background sequences.' To do so, we selected several transcription factors for which there were multiple experimentally validated binding sites in the S. cerevisiae genome listed in the Promoter database of Saccharomyces cerevisiae (SCPD[20]), and compared the rate of evolution within these binding sites to that of the promoter regions in which they were found

Read more

Summary

Introduction

The binding sites of sequence specific transcription factors are an important and relatively well-understood class of functional non-coding DNAs. a wide variety of experimental and computational methods have been developed to characterize transcription factor binding sites, they remain difficult to identify. The binding specifities of transcription factors are typically represented as consensus sequences or position weight matrices [1] that summarize their position-specific sequence preferences. In some cases, such 'motif' models of transcription factor binding sites can be inferred from genome sequences using computational methods [2,3,4,5,6,7]. Several methods [8,9,10,11,12] have been developed to identify conserved non-coding sequences that, when tested, often function as regulatory sequences in vivo (reviewed in [13])

Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call