Abstract

Background

Human cancer is caused by the accumulation of somatic mutations in tumor suppressors and oncogenes within the genome. In the case of oncogenes, recent theory suggests that there are only a few key “driver” mutations responsible for tumorigenesis. As there have been significant pharmacological successes in developing drugs that treat cancers that carry these driver mutations, several methods that rely on mutational clustering have been developed to identify them. However, these methods consider proteins as a single strand without taking their spatial structures into account. We propose an extension to current methodology that incorporates protein tertiary structure in order to increase our power when identifying mutation clustering.

Results

We have developed iPAC (identification of Protein Amino acid Clustering), an algorithm that identifies non-random somatic mutations in proteins while taking into account the three dimensional protein structure. By using the tertiary information, we are able to detect both novel clusters in proteins that are known to exhibit mutation clustering as well as identify clusters in proteins without evidence of clustering based on existing methods. For example, by combining the data in the Protein Data Bank (PDB) and the Catalogue of Somatic Mutations in Cancer, our algorithm identifies new mutational clusters in well known cancer proteins such as KRAS and PI3KC α. Further, by utilizing the tertiary structure, our algorithm also identifies clusters in EGFR, EIF2AK2, and other proteins that are not identified by current methodology. The R package is available at: http://www.bioconductor.org/packages/2.12/bioc/html/iPAC.html.

Conclusion

Our algorithm extends the current methodology to identify oncogenic activating driver mutations by utilizing tertiary protein structure when identifying nonrandom somatic residue mutation clusters.

Highlights

  • Human cancer is caused by the accumulation of somatic mutations in tumor suppressors and oncogenes within the genome

  • iPAC also identified three new proteins: EGFR, EIF2AK2, and HAO1. These three proteins account for 10 of the 215 structures found to have clustering. In addition, iPAC found structure 2ENQ for the protein PIK3CA to contain a significant cluster, while Non-Random Mutational Clustering (NMC) did not

  • No protein identified by NMC was missed by the iPAC algorithm

Introduction

At its most basic, cancer is a genetic disease caused by the accumulation of somatic mutations in oncogenes and tumor suppressors within the genome [1]. Pharmacological intervention has been shown to be more successful at inhibiting activating oncogenes than at restoring tumor suppressor gene function. Because drugs that treat cancers carrying oncogenic driver mutations have had significant pharmacological success, several methods that rely on mutational clustering have been developed to identify these mutations. However, these methods consider proteins as a single strand without taking their spatial structures into account. Mutational clusters that lead to either beneficial or detrimental phenotypic changes may point to regions that are under positive or directional selection, as well as regions that are functionally significant and can be targeted by protein engineering [7].
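To make the single-strand limitation concrete, the short Python sketch below contrasts distance along the sequence with distance in three-dimensional space. It is not the iPAC algorithm or any part of the iPAC R package. The coordinates are invented for illustration (they are not taken from any PDB entry), and the function names and the 5 Å cutoff are hypothetical choices made only for this example.

```python
import math

# Hypothetical C-alpha coordinates in angstroms, keyed by residue position.
# These values are invented for illustration, not read from a PDB file.
ca_coords = {
    10: (0.0, 0.0, 0.0),
    11: (3.8, 0.0, 0.0),
    12: (7.6, 0.0, 0.0),
    95: (1.0, 4.0, 0.0),
    96: (4.8, 4.0, 0.0),
}


def sequence_distance(i, j):
    """Separation of two residues along the linear amino acid chain."""
    return abs(i - j)


def spatial_distance(i, j, coords):
    """Euclidean distance between two residues in the folded structure."""
    return math.dist(coords[i], coords[j])


def spatial_neighbors(residue, coords, cutoff):
    """Residues within `cutoff` angstroms of `residue`, sorted by position."""
    return sorted(
        other
        for other in coords
        if other != residue
        and spatial_distance(residue, other, coords) <= cutoff
    )


if __name__ == "__main__":
    # Residues 10 and 95 are far apart in sequence but close in space.
    print(sequence_distance(10, 95))
    print(round(spatial_distance(10, 95, ca_coords), 2))
    print(spatial_neighbors(10, ca_coords, cutoff=5.0))
```

In this toy structure, residues 10 and 95 are 85 positions apart along the chain, yet both fall within the same 5 Å neighborhood. A method that sees only the linear sequence would treat mutations at these two positions as unrelated, which is the gap that incorporating tertiary structure is meant to close.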
