Abstract

The discovery of driver mutations is one of the key motivations for cancer genome sequencing. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumour types, we describe DriverPower, a software package that uses mutational burden and functional impact evidence to identify driver mutations in coding and non-coding sites within cancer whole genomes. Using a total of 1373 genomic features derived from public sources, DriverPower’s background mutation model explains up to 93% of the regional variance in the mutation rate across multiple tumour types. Incorporating functional impact scores further increases the accuracy of driver discovery. Tested on a collection of 2583 cancer genomes from the PCAWG project, DriverPower identifies 217 coding and 95 non-coding driver candidates. Compared with six published methods used by the PCAWG Drivers and Functional Interpretation Working Group, DriverPower has the highest F1 score for both coding and non-coding driver discovery. This demonstrates that DriverPower is an effective framework for computational driver discovery.

Highlights

  • The discovery of driver mutations is one of the key motivations for cancer genome sequencing

  • Most state-of-the-art methods identify drivers by detecting signals of positive selection either through mutational burden tests, which compare the rate of mutations observed in a region of the genome to what is expected from the background mutation rate (BMR), or functional impact tests, which identify putative driver mutations based on a higher-than-expected rate of changes that are predicted to alter the function of genomic elements[3,6]

  • The method takes advantage of the large somatic mutation sets produced by whole-genome sequencing (WGS) technology to build an accurate global BMR model from more than a thousand genomic features
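The mutational burden test described in the highlights can be illustrated with a minimal sketch. It assumes a simple Poisson model in which the expected mutation count for an element is the background mutation rate (BMR) multiplied by the element length and the number of samples. This is an illustrative simplification, not DriverPower's actual model, and the function names `poisson_upper_tail` and `burden_test` are hypothetical.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    # Accumulate P(X = k) for k = 0 .. observed - 1, then take the complement.
    term = math.exp(-expected)
    cdf = term
    for k in range(1, observed):
        term *= expected / k
        cdf += term
    # Clamp at zero to guard against floating-point round-off.
    return max(0.0, 1.0 - cdf)


def burden_test(observed: int, bmr: float, length_bp: int, n_samples: int) -> float:
    """One-sided burden p-value under an assumed Poisson background model.

    bmr is the assumed background mutation rate per base pair per sample.
    """
    expected = bmr * length_bp * n_samples
    return poisson_upper_tail(observed, expected)


# Hypothetical element: 1 kb long, 100 samples, BMR of 1e-6 per bp per sample,
# so 0.1 mutations are expected; 5 observed mutations is a strong excess.
p_value = burden_test(observed=5, bmr=1e-6, length_bp=1000, n_samples=100)
print(f"burden p-value: {p_value:.3g}")
```

A small p-value indicates more mutations than the background model predicts, which is the signal of positive selection a burden test looks for. In practice the expected count would come from a fitted BMR model rather than a single constant rate, and very small p-values would need a more numerically careful tail computation than the complement used here.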


Summary

Introduction

The discovery of driver mutations is one of the key motivations for cancer genome sequencing. As part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we describe DriverPower (Fig. 1a), a framework that combines two mutation significance testing methods, mutational burden and functional impact, to identify coding and non-coding cancer drivers within whole genomes. The PCAWG Consortium aggregated WGS data from 2658 cancers across 38 tumour types generated by the ICGC and TCGA projects. These sequencing data were reanalysed with standardised, high-accuracy pipelines to align to the human genome (reference build hs37d5) and identify germline variants and somatically acquired mutations, as described in ref
