Abstract

Background and Objectives: Recent next-generation sequencing studies of different cancers reveal a varied spectrum of mutations: individual patients carry dozens to hundreds of mutations, with little overlap between patients. A difficult problem is to understand which of the observed mutations contribute to tumorigenesis. While several approaches have been used to identify significantly mutated genes, these approaches do not calculate the mutation frequencies expected by chance. Here, we develop a simulation of random mutagenesis and compare observed mutation frequencies (Fo) to random expectation frequencies (Fr) to identify genes that are likely to be selected for or against in tumors. Methods: Our random simulation method, implemented in Matlab, applies different mutation probabilities to A or T bases, G or C bases, and CpG repeats. Mutations were simulated on Agilent's 50 Mb exon library and are reported at the gene level. Each simulation was run for 100 trials, with each trial consisting of 316 repeats. The 100 trials enable calculation of standard deviations for the random mutation frequencies. We also carried out a bootstrap analysis of the observed data to estimate standard deviations of the observed mutation frequencies. We calculated differences between Fo and Fr using ratio and rank comparisons. Results: We applied our approach to ovarian cancer data reported by The Cancer Genome Atlas in 2011. We found a significant difference between observed mutations and random expectation mutations. Random mutations correlate well with gene length (R²=0.54 for one trial), while almost no correlation is seen for observed mutations (R²=0.11). We also found one set of genes that is mutated at higher frequencies than expected and another set that is mutated at lower frequencies than expected. Conclusions: Our simulated mutagenesis method is a novel approach to determining the significance of observed mutations in cancer. Application of this model to ovarian cancer data shows a significant discrepancy between expected and observed mutations, possibly indicating that most observed gene mutations are selected for. In addition, our approach reveals specific genes that may be implicated in tumorigenesis. These genes are good candidates for further study.
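
The simulation described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' Matlab implementation. The probability values, the function names, and the definition of a gene's mutation frequency as the fraction of repeats in which the gene receives at least one mutation are all assumptions made for illustration; the abstract does not specify them. "CpG" is interpreted here as a C immediately followed by a G.

```python
import random
from statistics import mean, stdev

# Hypothetical per-base mutation probabilities (not taken from the abstract).
P_AT = 0.001   # A or T bases
P_GC = 0.002   # G or C bases outside a CpG
P_CPG = 0.010  # C or G that is part of a CpG dinucleotide


def base_probabilities(seq):
    """Return one mutation probability per base, chosen by sequence context."""
    seq = seq.upper()
    probs = []
    for i, base in enumerate(seq):
        in_cpg = (base == "C" and seq[i + 1:i + 2] == "G") or (
            base == "G" and i > 0 and seq[i - 1] == "C"
        )
        if in_cpg:
            probs.append(P_CPG)
        elif base in ("G", "C"):
            probs.append(P_GC)
        else:
            # A, T, and any other symbol fall back to the A/T probability.
            probs.append(P_AT)
    return probs


def simulate_trial(genes, repeats, rng):
    """Fraction of repeats in which each gene receives at least one mutation."""
    hits = {name: 0 for name in genes}
    for _ in range(repeats):
        for name, probs in genes.items():
            if any(rng.random() < p for p in probs):
                hits[name] += 1
    return {name: count / repeats for name, count in hits.items()}


def random_expectation(sequences, trials=100, repeats=316, seed=0):
    """Mean and standard deviation of Fr per gene across independent trials."""
    rng = random.Random(seed)
    genes = {name: base_probabilities(seq) for name, seq in sequences.items()}
    per_trial = [simulate_trial(genes, repeats, rng) for _ in range(trials)]
    return {
        name: (mean(t[name] for t in per_trial), stdev(t[name] for t in per_trial))
        for name in genes
    }


def fo_fr_ratio(fo, fr):
    """Ratio comparison of observed to random expectation frequency."""
    return fo / fr if fr > 0 else float("inf")
```

Calling `random_expectation` with a dictionary that maps gene names to exon sequences returns, for each gene, the mean Fr and its standard deviation over the trials. Under this sketch's assumptions, a ratio well above 1 from `fo_fr_ratio` would flag a gene mutated more often than expected by chance, and a ratio well below 1 a gene mutated less often.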
