Abstract

Large sets of whole cancer genomes make it possible to study mutation hotspots genome-wide. Here we detect, categorize, and characterize site-specific hotspots using 2279 whole cancer genomes from the Pan-Cancer Analysis of Whole Genomes project and provide a resource of annotated hotspots genome-wide. We investigate the excess of hotspots in both protein-coding and gene regulatory regions and develop measures of positive selection and functional impact for individual hotspots. Using cancer allele fractions, expression aberrations, mutational signatures, and a variety of genomic features, such as potential gain or loss of transcription factor binding sites, we annotate and prioritize all highly mutated hotspots. Genome-wide we find more high-frequency SNV and indel hotspots than expected given mutational background models. Protein-coding regions are generally enriched for SNV hotspots compared to other regions. Gene regulatory hotspots show enrichment of potential same-patient second-hit missense mutations, consistent with enrichment of hotspot driver mutations compared to singletons. For protein-coding regions, splice-sites, promoters, and enhancers, we see an excess of hotspots associated with cancer genes. Interestingly, missense hotspot mutations in tumor suppressors are associated with elevated expression, suggesting localized amino-acid changes with functional impact. For individual non-coding hotspots, only a small number show clear signs of positive selection, including known sites in the TERT promoter and the 5’ UTR of TP53. Most of the new candidates have few mutations and limited driver evidence. However, a hotspot in an enhancer of the oncogene POU2AF1, which may create a transcription factor binding site, presents multiple lines of driver-consistent evidence.

Highlights

  • Mutations accumulate in human genomes throughout life

  • Candidate driver hotspot in the POU2AF1 enhancer: several of our individual analyses suggest that an enhancer hotspot associated with the oncogene POU2AF1 may be under positive selection. The hotspot comes up as a top result in the TFBS analysis; it contributes to the enrichment of cancer genes among enhancer hotspots; it has an above-median ΔCAF z-score; and, based on the contributions of mutational signatures, we do not believe that background processes caused these mutations

  • We identified more than 700,000 site-specific hotspots, which is more than expected by chance under the given models for the background mutation rate
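
To make the idea of a site-specific hotspot and a chance expectation concrete, here is a minimal sketch. It is not the authors' pipeline. It assumes, for illustration only, that a hotspot is a genomic position mutated in at least two patients, and it uses a deliberately naive uniform background in which every patient mutates every site independently with the same probability. The function names, the threshold, and the toy data are all hypothetical.

```python
from collections import defaultdict


def find_hotspots(mutations, min_patients=2):
    """Return {(chrom, pos): n_patients} for sites mutated in >= min_patients patients.

    `mutations` is an iterable of (patient_id, chrom, pos) tuples.
    The min_patients=2 threshold is an assumption made for illustration.
    """
    patients_at_site = defaultdict(set)
    for patient_id, chrom, pos in mutations:
        # A set ensures each patient is counted at most once per site.
        patients_at_site[(chrom, pos)].add(patient_id)
    return {
        site: len(patients)
        for site, patients in patients_at_site.items()
        if len(patients) >= min_patients
    }


def expected_hotspots_uniform(n_sites, n_patients, p):
    """Expected number of sites mutated in >= 2 patients under a uniform model.

    Assumes every patient mutates every site independently with probability p.
    This is a naive stand-in; realistic background models vary by site and sample.
    """
    p_zero = (1 - p) ** n_patients
    p_one = n_patients * p * (1 - p) ** (n_patients - 1)
    return n_sites * (1 - p_zero - p_one)


# Toy, made-up data: (patient_id, chrom, pos)
toy = [
    ("P1", "chr1", 100),
    ("P2", "chr1", 100),
    ("P2", "chr1", 100),  # duplicate call in the same patient, counted once
    ("P3", "chr2", 500),
]
hotspots = find_hotspots(toy)
```

Comparing `len(hotspots)` with the value returned by `expected_hotspots_uniform` gives a crude observed-versus-expected contrast. A uniform rate ignores the regional and signature-specific variation in mutation rate, so it would not give the same expectation as a proper background model.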

Introduction

Mutations accumulate in human genomes throughout life. The majority are functionally neutral “passenger” mutations without effect on cell viability. Accumulation of the few “driver” mutations that positively affect cell viability can cause cancer, as they enhance cells’ ability to proliferate, escape apoptosis, and eventually metastasize[1,2,3]. Driver mutations increase the relative fitness of cancer cells and rise in abundance through positive selection. Driver discovery has focused on identifying driver genes using whole-exome sequencing data. Whole-genome sequencing has enabled exploration of the 98% of the human genome that is non-coding. However, this has led to the discovery of only a few well-confirmed non-coding cancer elements. The best-known example is the promoter of the oncogene TERT, which is involved in the elongation of the DNA telomere ends during replication[4,5].
