Abstract

A major challenge in interpreting the large volume of mutation data identified by next-generation sequencing (NGS) is to distinguish driver mutations from neutral passenger mutations, which facilitates the identification of targetable genes and new drugs. Current approaches are primarily based on the mutation frequencies of single genes; they lack the power to detect infrequently mutated driver genes and ignore the functional interconnection and regulation among cancer genes. We propose a novel mutation network method, VarWalker, to prioritize driver genes in large-scale cancer mutation data. VarWalker fits generalized additive models for each sample based on sample-specific mutation profiles and builds on the joint frequency of both mutated genes and their close interactors. These interactors are selected and optimized using the Random Walk with Restart algorithm in a protein-protein interaction network. We applied the method to more than 300 tumor genomes in two large-scale NGS benchmark datasets: 183 lung adenocarcinoma samples and 121 melanoma samples. For each cancer, we derived a consensus mutation subnetwork containing significantly enriched consensus cancer genes and cancer-related functional pathways. These cancer-specific mutation networks were then validated using independent datasets for each cancer. Importantly, VarWalker prioritizes well-known, infrequently mutated genes, which are shown to interact with highly recurrently mutated genes yet have been ignored by conventional single-gene-based approaches. Using VarWalker, we demonstrated that network-assisted approaches can be effectively adapted to facilitate the detection of cancer driver genes in NGS data.
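
The abstract names the Random Walk with Restart algorithm but gives no details of how VarWalker applies it. As an illustration only, the generic algorithm can be sketched as below. This is not VarWalker's implementation: the function name, the four-node toy network, the node labels, and the restart probability of 0.5 are all assumptions made for the example.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-8, max_iter=1000):
    """Generic Random Walk with Restart on an undirected network.

    adjacency: dict mapping each node to the set of its neighbors.
    seeds:     nodes the walker jumps back to (for example, mutated genes).
    restart:   probability of returning to a seed at each step (assumed value).
    Returns a dict of steady-state visiting probabilities per node.
    """
    nodes = sorted(adjacency)
    seed_set = set(seeds)
    # Initial distribution: probability mass spread evenly over the seeds.
    p0 = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: a fraction of the mass always returns to the seeds.
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            neighbors = adjacency[n]
            if not neighbors:
                # An isolated node keeps its own walking mass.
                nxt[n] += (1.0 - restart) * p[n]
                continue
            # Walking term: remaining mass is split evenly among neighbors.
            share = (1.0 - restart) * p[n] / len(neighbors)
            for m in neighbors:
                nxt[m] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical toy network: a simple chain A - B - C - D, with A as the seed.
toy_network = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C"},
}
scores = random_walk_with_restart(toy_network, seeds=["A"])
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this sketch, nodes closer to the seed receive higher scores, which is the property that lets such a walk rank candidate interactors of a mutated gene by network proximity.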

Highlights

  • Next-generation sequencing (NGS) technologies have enabled genome-wide identification of somatic mutations in large-scale cancer samples

  • A cancer genome typically harbors both driver mutations, which contribute to tumorigenesis, and passenger mutations, which tend to be neutral and occur randomly

  • A major challenge in interpreting the large volume of mutation data identified in cancer genomes using next-generation sequencing (NGS) is to distinguish driver mutations from neutral passenger mutations

Introduction

Next-generation sequencing (NGS) technologies have enabled genome-wide identification of somatic mutations in large-scale cancer samples. One major challenge in interpreting the large volume of mutation data is to distinguish ‘driver’ mutations from numerous neutral ‘passenger’ mutations to facilitate the identification of targetable genes and new drugs. The most widely adopted method is to search for highly frequently mutated genes within one cancer type [1,2]. Although effective in many cases, frequency-based approaches have disadvantages, such as a lack of power to detect infrequently mutated driver genes and a failure to incorporate functional interconnections and regulation among genes. For a more comprehensive review, see [3,4].
