A site specific model and analysis of the neutral somatic mutation rate in whole-genome cancer data

Johanna Bertl, Morten Muhlig Nielsen, Jakob Skou Pedersen, Henrik Hornshøj, Asger Hobolth, Malene Juul, Qianyun Guo, Søren Besenbacher

doi:10.1186/s12859-018-2141-2

Open Access

https://doi.org/10.1186/s12859-018-2141-2

Journal: BMC Bioinformatics
Publication Date: Apr 19, 2018
Citations: 10
License type: open-access

Affiliation: Aarhus University

Abstract

Background: Detailed modelling of the neutral mutational process in cancer cells is crucial for identifying driver mutations and understanding the mutational mechanisms that act during cancer development. The neutral mutational process is very complex: whole-genome analyses have revealed that the mutation rate differs between cancer types, between patients and along the genome depending on the genetic and epigenetic context. Therefore, methods that predict the number of different types of mutations in regions or specific genomic elements must consider local genomic explanatory variables. A major drawback of most methods is the need to average the explanatory variables across the entire region or genomic element. This procedure is particularly problematic if the explanatory variable varies dramatically in the element under consideration.

Results: To take into account the fine scale of the explanatory variables, we model the probabilities of different types of mutations for each position in the genome by multinomial logistic regression. We analyse 505 cancer genomes from 14 different cancer types and compare the performance in predicting mutation rate for both regional based models and site-specific models. We show that for 1000 randomly selected genomic positions, the site-specific model predicts the mutation rate much better than regional based models. We use a forward selection procedure to identify the most important explanatory variables. The procedure identifies site-specific conservation (phyloP), replication timing, and expression level as the best predictors for the mutation rate. Finally, our model confirms and quantifies certain well-known mutational signatures.

Conclusion: We find that our site-specific multinomial regression model outperforms the regional based models. The possibility of including genomic variables on different scales and patient specific variables makes it a versatile framework for studying different mutational mechanisms. Our model can serve as the neutral null model for the mutational process; regions that deviate from the null model are candidates for elements that drive cancer development.
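To make the site-specific idea concrete, the following is a minimal sketch of a multinomial logistic model evaluated per position, assuming two hypothetical explanatory variables (a conservation score and a replication timing value) and two hypothetical mutation classes. The coefficients, class names and function names are invented for illustration and are not estimates or code from the paper. The sketch also contrasts summing per-site probabilities over a region with first averaging the explanatory variables across the region, which is the shortcut the abstract describes as problematic.

```python
import math

# Hypothetical classes and coefficients, chosen only for illustration.
# Each entry is (intercept, [slope for conservation, slope for replication timing]).
COEFS = {
    "transition": (-13.0, [-0.30, 0.20]),
    "transversion": (-14.0, [-0.10, 0.40]),
}


def mutation_probabilities(x, coefs=COEFS):
    """Multinomial logistic probabilities for one site.

    The reference outcome "none" (no mutation) has linear predictor 0.
    """
    scores = {"none": 0.0}
    for kind, (intercept, slopes) in coefs.items():
        scores[kind] = intercept + sum(b * xi for b, xi in zip(slopes, x))
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {kind: math.exp(s - top) for kind, s in scores.items()}
    total = sum(exps.values())
    return {kind: e / total for kind, e in exps.items()}


def expected_mutations(sites, coefs=COEFS):
    """Site-specific: sum the per-site probability of any mutation."""
    return sum(1.0 - mutation_probabilities(x, coefs)["none"] for x in sites)


def expected_mutations_regional(sites, coefs=COEFS):
    """Region-based shortcut: average the covariates first, then predict."""
    n = len(sites)
    mean_x = [sum(col) / n for col in zip(*sites)]
    return n * (1.0 - mutation_probabilities(mean_x, coefs)["none"])


# A toy region in which the first covariate varies strongly between sites.
sites = [[-2.0, 0.5], [6.0, 0.5], [0.0, 0.5], [8.0, 0.5]]
print(expected_mutations(sites), expected_mutations_regional(sites))
```

Because the link function is nonlinear, the two predictions differ whenever a covariate varies within the region, and the gap grows with that variation.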

Highlights

Detailed modelling of the neutral mutational process in cancer cells is crucial for identifying driver mutations and understanding the mutational mechanisms that act during cancer development
Many studies have investigated what determines the mutation rate and what kind of models should be used. [5] modeled the mutation heterogeneity using local regression with expression level and replication timing as explanatory variables. [4] applied random forest regression on mutation counts in 1 Mb windows using histone modifications and the density of DNase I hypersensitive sites. [6] predicted the number of mutations per element by a beta-binomial distribution using replication timing and noncoding annotations such as promoter, UTR and ultraconserved sites. These approaches segmented the genome into regions according to the explanatory variables and estimated separate models for them; in a site-specific regression model, this division is not necessary. [7] implemented a Poisson-binomial model on 50 kb windows, where they used logistic regression to predict the position-specific mutation probability, based on base pair, replication timing and the presence and type of transcript.
The mutation rate differs between samples from the same cancer type, with the largest variation seen for skin cutaneous melanoma

Summary

Introduction

Detailed modelling of the neutral mutational process in cancer cells is crucial for identifying driver mutations and understanding the mutational mechanisms that act during cancer development. Methods that predict the number of different types of mutations in regions or specific genomic elements must consider local genomic explanatory variables. [6] predicted the number of mutations per element by a beta-binomial distribution using replication timing and noncoding annotations such as promoter, UTR and ultraconserved sites. These approaches segmented the genome into regions according to the explanatory variables and estimated separate models for them; in a site-specific regression model, this division is not necessary. [7] implemented a Poisson-binomial model on 50 kb windows, where they used logistic regression to predict the position-specific mutation probability, based on base pair, replication timing and the presence and type of transcript. We include an explanatory variable to mask the repeat regions in the genome, because mutation calls in repeat regions are biased for technical reasons.
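As an illustration of how position-specific probabilities combine into a window-level count, here is a small sketch of the Poisson-binomial distribution: the number of mutated sites in a window when each site mutates independently with its own probability. It is not the implementation used in [7] or in this paper. The function names and the toy probabilities are assumptions, and the exact dynamic programme shown is quadratic in the window length, so it suits only short windows.

```python
def poisson_binomial_pmf(probs):
    """Exact distribution of the number of mutated sites in a window.

    probs: per-site mutation probabilities, assumed independent.
    Returns a list pmf where pmf[k] = P(exactly k sites are mutated).
    """
    pmf = [1.0]  # with zero sites, the count is 0 with certainty
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this site is not mutated
            nxt[k + 1] += mass * p      # this site is mutated
        pmf = nxt
    return pmf


def upper_tail(probs, observed):
    """P(count >= observed) under the null model given by probs."""
    return sum(poisson_binomial_pmf(probs)[observed:])


# Hypothetical per-site probabilities for a four-site toy window.
window = [0.01, 0.02, 0.005, 0.03]
pmf = poisson_binomial_pmf(window)
mean = sum(k * mass for k, mass in enumerate(pmf))
print(mean, upper_tail(window, 2))
```

The mean of this distribution equals the sum of the per-site probabilities. A small upper-tail probability for the observed count marks a window with more mutations than the null model predicts.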
