Abstract

Background: Massively parallel sequencing studies have led to the identification of a large number of mutations present in a minority of cancers of a given site. Hence, methods to identify the likely pathogenic mutations that are worth exploring experimentally and clinically are required. We sought to compare the performance of 15 mutation effect prediction algorithms and their agreement. As a hypothesis-generating aim, we sought to define whether combinations of prediction algorithms would improve the functional effect predictions of specific mutations.

Results: Literature and database mining of single nucleotide variants (SNVs) affecting 15 cancer genes was performed to identify mutations supported by functional evidence or hereditary disease association, which were classified as either non-neutral (n = 849) or neutral (n = 140) with respect to their impact on protein function. These SNVs were employed to test the performance of 15 mutation effect prediction algorithms. The accuracy of the prediction algorithms varies considerably. Although all algorithms perform consistently well in terms of positive predictive value, their negative predictive value varies substantially. Cancer-specific mutation effect predictors display no-to-almost perfect agreement in their predictions of these SNVs, whereas the non-cancer-specific predictors show no-to-moderate agreement. Combinations of predictors modestly improve accuracy and significantly improve negative predictive values.

Conclusions: The information provided by mutation effect predictors is not equivalent. No algorithm is able to predict with sufficient accuracy which SNVs should be taken forward for experimental or clinical testing. Combining algorithms aggregates orthogonal information and may result in improvements in the negative predictive value of mutation effect predictions.
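The quantities named in the abstract (accuracy, positive and negative predictive value, pairwise agreement, and combined predictions) can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not state: calls are encoded as 1 (non-neutral) and 0 (neutral), agreement is measured with Cohen's kappa, and predictors are combined by simple majority vote. The function names and the toy data are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Sequence


def confusion_counts(truth: Sequence[int], pred: Sequence[int]) -> Dict[str, int]:
    """Count TP/FP/TN/FN, treating 1 as non-neutral and 0 as neutral."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for t, p in zip(truth, pred):
        if p == 1:
            counts["tp" if t == 1 else "fp"] += 1
        else:
            counts["tn" if t == 0 else "fn"] += 1
    return counts


def _ratio(num: int, den: int) -> float:
    """Return num/den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def performance(truth: Sequence[int], pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, positive predictive value and negative predictive value."""
    c = confusion_counts(truth, pred)
    return {
        "accuracy": _ratio(c["tp"] + c["tn"], len(truth)),
        "ppv": _ratio(c["tp"], c["tp"] + c["fp"]),
        "npv": _ratio(c["tn"], c["tn"] + c["fn"]),
    }


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Chance-corrected agreement between two binary predictors."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1.0:  # both predictors are constant and identical
        return 1.0
    return (observed - expected) / (1 - expected)


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine predictors: call non-neutral when more than half agree."""
    return [1 if 2 * sum(votes) > len(votes) else 0 for votes in zip(*predictions)]


if __name__ == "__main__":
    # Toy labels for six SNVs (hypothetical, for illustration only).
    truth = [1, 1, 1, 1, 0, 0]
    pred_a = [1, 1, 1, 0, 1, 0]
    pred_b = [1, 1, 0, 1, 0, 0]
    pred_c = [1, 0, 1, 1, 0, 1]
    print(performance(truth, pred_a))
    print(cohen_kappa(pred_a, pred_b))
    print(performance(truth, majority_vote([pred_a, pred_b, pred_c])))
```

In this toy data, each predictor makes different errors, so the majority vote recovers calls that any single predictor misses. That mirrors the idea of aggregating orthogonal information, although the combination scheme actually used in the study may differ.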

Highlights

  • Parallel sequencing studies have led to the identification of a large number of mutations present in a minority of cancers of a given site

  • The Cancer Genome Atlas (TCGA), the International Cancer Genome Consortium (ICGC) and endeavors led by individual investigators have demonstrated that the repertoire of genes affected by highly recurrent mutations is limited and that there is a large collection of genes affected by mutations in 1%

  • Given that PolyPhen-2, MutationTaster, Cancer Driver Annotation (CanDrA), and CONsensus DELeteriousness score of missense mutations (Condel) can only define the potential functional impact of single nucleotide variants (SNVs), dinucleotide and trinucleotide changes were excluded from this study

Introduction

Massively parallel sequencing studies have demonstrated that tumors can be regarded as genetically heterogeneous populations of individual clones that accumulate mutations during the process of tumorigenesis and tumor progression [1]. These mutations, likely the result of genetic instability, may confer a selective growth advantage and be causally implicated in carcinogenesis (that is, driver mutations), or are either selectively neutral
