Abstract

Changes in gene regulation may be important in evolution. However, the evolutionary properties of regulatory mutations are currently poorly understood. This is partly the result of an incomplete annotation of functional regulatory DNA in many species. For example, transcription factor binding sites (TFBSs), a major component of eukaryotic regulatory architecture, are typically short, degenerate, and therefore difficult to differentiate from randomly occurring, nonfunctional sequences. Furthermore, although sites such as TFBSs can be computationally predicted using evolutionary conservation as a criterion, estimates of the true level of selective constraint (defined as the fraction of strongly deleterious mutations occurring at a locus) in regulatory regions will, by definition, be upwardly biased in datasets that are a priori evolutionarily conserved. Here we investigate the fitness effects of regulatory mutations using two complementary datasets of human TFBSs that are likely to be relatively free of ascertainment bias with respect to evolutionary conservation but, importantly, are supported by experimental data. The first is a collection of approximately 2,100 human TFBSs drawn from the literature in the TRANSFAC database, and the second is derived from several recent high-throughput chromatin immunoprecipitation coupled with genomic microarray (ChIP-chip) analyses. We also define a set of putative cis-regulatory modules (pCRMs) by spatially clustering multiple TFBSs that regulate the same gene. We find that a relatively high proportion (∼37%) of mutations at TFBSs are strongly deleterious, similar to that at a 2-fold degenerate protein-coding site. However, constraint is significantly reduced in human and chimpanzee pCRMs and ChIP-chip sequences, relative to macaques. We estimate that the fraction of regulatory mutations that have been driven to fixation by positive selection in humans is not significantly different from zero.
We also find that the level of selective constraint in our TFBSs, pCRMs, and ChIP-chip sequences is negatively correlated with the expression breadth of the regulated gene, whereas the opposite relationship holds at that gene's nonsynonymous and synonymous sites. Finally, we find that the rate of protein evolution in a transcription factor appears to be positively correlated with the breadth of expression of the gene it regulates. Our study suggests that strongly deleterious regulatory mutations are considerably more likely (1.6-fold) to occur in tissue-specific than in housekeeping genes, implying that there is a fitness cost to increasing “complexity” of gene expression.

Highlights

  • Changes in gene regulation are likely to play an important role in evolution [1,2]

  • Gene expression is primarily controlled by sites, such as transcription factor binding sites (TFBSs), located in the noncoding region of the genome

  • We collected a dataset of TFBSs derived from the experimental biology literature and recent high-throughput studies to estimate the proportions of new mutations in TFBSs that have strongly deleterious and strongly beneficial effects upon organism fitness

Introduction

Changes in gene regulation are likely to play an important role in evolution [1,2]. However, compared to protein-coding sequences, the fitness effects of regulatory mutations remain poorly understood. The relationship between changes in gene regulatory regions and the expression phenotype of the regulated gene is also unclear. Both of these issues are partly a result of poor annotation of the sites that control gene regulation, the vast majority of which are likely to be noncoding. Transcription factor binding sites (TFBSs), a major component of regulatory architecture, are small (6–15 bp), laborious to identify experimentally, and potentially degenerate. Due to their small size, genuine TFBSs are difficult to differentiate from similar, randomly occurring sequences that are present in large numbers in mammalian genomes.
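
To give a rough sense of why short motifs are hard to separate from chance matches, the following back-of-the-envelope sketch computes the expected number of exact occurrences of a single fixed motif in random sequence. It is an illustration only, not part of the analysis described here: the function name `expected_chance_matches` is hypothetical, and the calculation assumes equal base frequencies, independent positions, a genome of about 3.1 × 10^9 bp, and no degeneracy in the motif. Real genomes violate all of these assumptions, and degenerate motifs would match even more often.

```python
def expected_chance_matches(motif_length: int,
                            genome_size_bp: float = 3.1e9,
                            both_strands: bool = True) -> float:
    """Expected number of exact matches to one fixed motif in random sequence.

    Assumes each base occurs with probability 1/4, independently of its
    neighbours (an idealisation; real genomes have biased composition).
    Edge effects at the ends of the sequence are ignored.
    """
    # Probability that a given start position matches the motif exactly.
    per_position_probability = 0.25 ** motif_length
    # Scanning both strands roughly doubles the number of positions tested
    # (this overcounts slightly for reverse-palindromic motifs).
    strands = 2 if both_strands else 1
    return genome_size_bp * strands * per_position_probability


# Motif lengths spanning the 6-15 bp range quoted for TFBSs.
for k in (6, 10, 15):
    print(f"{k} bp motif: ~{expected_chance_matches(k):,.0f} expected chance matches")
```

Under these simplifying assumptions, a 6 bp motif is expected to occur by chance on the order of a million times in a genome of this size, whereas a 15 bp motif is expected only a handful of times, which is why the shortest binding sites are the hardest to distinguish from background sequence.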
