Abstract

Disruption of gene regulation is known to play major roles in carcinogenesis and tumour progression. Here, we comprehensively characterize the mutational profiles of diverse transcription factor binding sites (TFBSs) across 1,574 completely sequenced cancer genomes encompassing 11 tumour types. We assess the relative rates and impact of the mutational burden at the binding sites of 81 transcription factors (TFs) by comparing the abundance and patterns of single base substitutions within putatively functional binding sites with those at control sites of matched sequence composition. There is a strong (1.43-fold) and significant excess of mutations at functional binding sites across TFs, and the mutations that accumulate in cancers are typically more disruptive than variants tolerated in extant human populations at the same sites. CTCF binding sites suffer an exceptionally high mutational load in cancer (3.31-fold excess) relative to control sites, and we demonstrate for the first time that this effect is seen in essentially all cancer types with sufficient data. The subset of CTCF sites involved in higher-order chromatin structures has the highest mutational burden, suggesting a widespread breakdown of chromatin organization. However, we find no evidence for selection driving these distinctive patterns of mutation. The mutational load at CTCF binding sites is substantially determined by replication timing and the mutational signature of the tumour in question, suggesting that selectively neutral processes underlie the unusual mutation patterns. Pervasive hyper-mutation within transcription factor binding sites rewires the regulatory landscape of the cancer genome, but it is dominated by mutational processes rather than selection.

Highlights

  • Most large-scale surveys of somatic mutation in cancer have focussed on protein-coding sequences, and catalogues of genes that carry recurrent mutations already number in the hundreds [1,2,3], but it has long been speculated that driver mutations are likely to exist in the 98% of the genome sequence outside protein-coding exons [4]

  • We study the patterns of mutations accumulating at short DNA segments bound by regulatory proteins across many cancer types and in the human population

  • Functional transcription factor binding sites (TFBSs) are enriched for mutations across transcription factors and cancers. We compiled a total of 9,958,580 somatic single base substitutions across 1,574 tumour samples from 11 different tumour types; consistent with previous studies [2,5], there was a high degree of variation in substitution rates amongst tumour types (Table 1)

Introduction

Most large-scale surveys of somatic mutation in cancer have focussed on protein-coding sequences, and catalogues of genes that carry recurrent mutations already number in the hundreds [1,2,3], but it has long been speculated that driver mutations are likely to exist in the 98% of the genome sequence outside protein-coding exons [4]. Our view of transcriptional regulation in the human genome has changed radically as large consortia have profiled chromatin features across multiple cell types [7], including extensive catalogues of active regulatory elements [8]. The reliable detection of elevated mutation at particular sites requires careful comparisons with control sites, accounting for the features associated with the sites under scrutiny, such as nucleotide composition, fine-scale chromatin accessibility and replication timing [11,13]. Some studies of mutation at regulatory sites have suffered from low sample sizes per cancer type but were still able to identify a number of recurrently mutated promoters [14], for example that of the telomerase reverse transcriptase (TERT) gene in melanomas [15].
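The fold-excess figures quoted in the abstract compare a mutation rate at functional binding sites with the rate at matched control sites. As a minimal sketch of that kind of comparison (not the authors' pipeline), the function below divides the two per-base rates and attaches an approximate 95% interval using the standard log rate-ratio approximation for Poisson counts. The function name `fold_excess`, its parameters, and the toy counts are all hypothetical, and the sketch assumes the control sites have already been matched for sequence composition.

```python
import math


def fold_excess(mut_functional, bases_functional, mut_control, bases_control):
    """Return (ratio, (low, high)) for functional vs control mutation rates.

    Illustrative only: counts are treated as Poisson, and the interval uses
    the usual approximation SE[log(ratio)] = sqrt(1/a + 1/b), where a and b
    are the two mutation counts. Both counts must be greater than zero.
    """
    rate_functional = mut_functional / bases_functional
    rate_control = mut_control / bases_control
    ratio = rate_functional / rate_control

    # Approximate 95% confidence interval on the log scale.
    se_log = math.sqrt(1.0 / mut_functional + 1.0 / mut_control)
    low = math.exp(math.log(ratio) - 1.96 * se_log)
    high = math.exp(math.log(ratio) + 1.96 * se_log)
    return ratio, (low, high)


# Hypothetical toy counts (not data from the study), chosen so that the
# ratio of per-base rates is 1.43.
ratio, (low, high) = fold_excess(
    mut_functional=286, bases_functional=100_000,
    mut_control=200, bases_control=100_000,
)
print(f"fold excess = {ratio:.2f}, approx. 95% CI = ({low:.2f}, {high:.2f})")
```

An interval that excludes 1 would suggest an excess at the functional sites, but only relative to the chosen controls: a ratio of this kind does not by itself account for covariates such as chromatin accessibility or replication timing, which is why the choice of control sites matters.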
