Abstract

Driver mutations are somatic mutations that provide a growth advantage to tumor cells, while passenger mutations are those not functionally related to oncogenesis. Distinguishing drivers from passengers is challenging because drivers occur much less frequently than passengers, tend to have low prevalence, and have functions that are multifactorial and not intuitively obvious. Missense mutations are excellent candidates as drivers, as they occur more frequently and are potentially easier to identify than other types of mutations. Although several methods have been developed for predicting the functional impact of missense mutations, only a few have been specifically designed for identifying driver mutations. As more mutations are discovered, more accurate predictive models can be developed using machine learning approaches that systematically characterize the commonality and peculiarity of missense mutations against the background of specific cancer types. Here, we present a cancer driver annotation (CanDrA) tool that predicts missense driver mutations based on a set of 95 structural and evolutionary features computed by over 10 functional prediction algorithms such as CHASM, SIFT, and MutationAssessor. Through feature optimization and supervised training, CanDrA outperforms existing tools in analyzing the glioblastoma multiforme and ovarian carcinoma data sets in The Cancer Genome Atlas and the Cancer Cell Line Encyclopedia project.

Highlights

  • We demonstrated that cancer driver annotation (CanDrA) achieved better sensitivity and specificity than other tools in predicting driver mutations in glioblastoma multiforme (GBM) and ovarian carcinoma (OVC)

  • CanDrA classifies each mutation into one of 3 categories (driver, no-call, or passenger) based on scores computed by a support vector machine (SVM) (Figure S1 in File S1) [36]

  • Among the 3 core features, CONDEL [25], a method that combines five features from SIFT, PolyPhen-2, MutationAssessor, and other sources based on a set of 20,000 non-synonymous germline single nucleotide variants (SNVs), was shown to be the single best predictor on the GBM.Ex dataset, with an area under the curve (AUC) of 0.703
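
The three-way call described in the highlights can be illustrated with a minimal sketch. It assumes the classifier returns a single real-valued score per mutation and that two cutoffs separate the categories; the cutoff values and the names `call_mutation`, `PASSENGER_MAX`, and `DRIVER_MIN` are hypothetical and are not taken from CanDrA itself.

```python
from typing import List

# Hypothetical cutoffs on a classifier score. The thresholds CanDrA
# actually uses are not stated in this text.
PASSENGER_MAX = -0.5
DRIVER_MIN = 0.5


def call_mutation(score: float,
                  passenger_max: float = PASSENGER_MAX,
                  driver_min: float = DRIVER_MIN) -> str:
    """Map a classifier score to 'driver', 'no-call', or 'passenger'."""
    if passenger_max >= driver_min:
        raise ValueError("passenger_max must be below driver_min")
    if score >= driver_min:
        return "driver"
    if score <= passenger_max:
        return "passenger"
    # Scores between the two cutoffs are too ambiguous to call.
    return "no-call"


def call_mutations(scores: List[float]) -> List[str]:
    """Apply the three-way call to a list of scores."""
    return [call_mutation(s) for s in scores]


# Made-up scores, for illustration only.
print(call_mutations([1.2, 0.1, -0.9]))
# prints ['driver', 'no-call', 'passenger']
```

Keeping an explicit no-call band between the two cutoffs trades coverage for confidence: widening the band leaves more mutations unclassified but makes the driver and passenger calls more conservative.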


Summary

Introduction

At different stages of oncogenesis, a group of key mutations, called drivers, significantly alter the normal cellular system [2,3] and confer growth and survival advantages to tumor cells [4]. Due to the inherent genomic instability present in tumors, driver mutations occur against a background of a large number of mutations, called passengers, that are not functionally related to oncogenesis. The identification of driver mutations is a critical mission of cancer genomics. Research that interrogates specific driver mutations and their clinical implications is being widely conducted for multiple types of cancer [7,8], but more effort is needed for systematic genome-wide characterization of driver mutations and their functional implications.

