Abstract

Identifying deleterious mutations remains a challenge in cancer genome sequencing projects, reflecting the vast number of candidate mutations per tumour and the existence of interpatient heterogeneity. Based on a 3D protein interaction network profiled via large-scale cross-linking mass spectrometry, we propose a weighted-average formula that combines three types of information into a ‘meta-score’. We assume that a single amino acid polymorphism (SAP) may have a deleterious effect if the mutation rarely occurs naturally during evolution, if it inhibits binding between a pair of interacting proteins when located at their interface, or if it plays an important role in a protein-protein interaction (PPI) network. Cross-validation indicated that the new method achieves an AUC of 0.93 and outperforms other widely used tools. Applying the method to the CPTAC colorectal cancer dataset enabled the accurate identification of validated deleterious mutations and yielded insights into their potential pathogenesis. Survival analysis showed that the accumulation of deleterious SAPs is significantly associated with a poor prognosis. The new method provides an alternative approach to identifying and ranking deleterious cancer SAPs based on a 3D PPI network and will contribute to the understanding of pathogenesis and the discovery of prognostic biomarkers.
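For readers unfamiliar with the AUC figure quoted in the abstract: it can be read as the probability that a randomly chosen deleterious SAP receives a higher score than a randomly chosen neutral one. The sketch below computes it by direct pairwise comparison. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the paper.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for deleterious, 0 for neutral (hypothetical encoding).
    scores: predicted risk, higher meaning more likely deleterious.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    toy_labels = [1, 1, 0, 0]
    toy_scores = [0.9, 0.4, 0.5, 0.1]
    print(roc_auc(toy_labels, toy_scores))
```

The double loop is quadratic in the number of examples; it is meant to show the definition, and a rank-based implementation would be preferable for large benchmark sets.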

Highlights

  • The accumulation of DNA mutations can cause cancer[1] when these mutations occur in coding regions and lead to single amino acid substitutions[2,3]

  • We identified single amino acid polymorphisms (SAPs) located at the interface between pairs of interacting proteins, where the interacting pairs were identified from cross-linking experiments and INstruct data, as these mutations may disrupt protein interactions

  • We developed an integrative approach, the network-integrated risk predictor of somatic SAPs (NIPS), which uses a meta-score to evaluate the risk of SAPs computationally and to identify deleterious SAPs in cancer by combining information on protein-protein interaction (PPI) 3D structure (I-score), network topology (T-score), and sequence conservation (S-score)
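The meta-score described in the highlights can be illustrated with a minimal weighted-average sketch. The source names the three components (I-score, T-score, S-score) but this excerpt does not give the weights or the score ranges, so the equal default weights, the assumption that every score lies in [0, 1], and the names `SapScores`, `meta_score`, and `rank_saps` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class SapScores:
    """Per-SAP component scores (assumed here to lie in [0, 1])."""
    name: str
    i_score: float  # PPI 3D interface information
    t_score: float  # network topology
    s_score: float  # sequence conservation


def meta_score(sap: SapScores,
               weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Weighted average of the three component scores.

    The weights are placeholders; the values used by NIPS are not
    stated in this excerpt.
    """
    w_i, w_t, w_s = weights
    total = w_i + w_t + w_s
    if min(weights) < 0 or total <= 0:
        raise ValueError("weights must be non-negative and not all zero")
    return (w_i * sap.i_score + w_t * sap.t_score + w_s * sap.s_score) / total


def rank_saps(saps: Iterable[SapScores],
              weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> List[SapScores]:
    """Order SAPs from highest to lowest meta-score."""
    return sorted(saps, key=lambda s: meta_score(s, weights), reverse=True)


if __name__ == "__main__":
    # Invented example values, for illustration only.
    candidates = [
        SapScores("SAP_A", i_score=0.9, t_score=0.6, s_score=0.3),
        SapScores("SAP_B", i_score=0.2, t_score=0.4, s_score=0.3),
    ]
    for sap in rank_saps(candidates):
        print(sap.name, round(meta_score(sap), 3))
```

Normalising by the sum of the weights keeps the meta-score on the same scale as its inputs, which makes scores comparable across different weightings.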

Summary

Introduction

The accumulation of DNA mutations can cause cancer[1] when these mutations occur in coding regions and lead to single amino acid substitutions[2,3]. Among the 2,000,000 coding mutations described in COSMIC (version 70), most have no effect on disease development[6], and only a few are closely associated with or lead to cancer. These changes are referred to as deleterious mutations or, at the protein level, deleterious single amino acid polymorphisms (SAPs)[7,8]. As a cancer-specific tool, CHASM (cancer-specific high-throughput annotation of somatic mutations) is a major machine-learning approach employing a random forest algorithm[17] and was trained using 49 predictive features, including conservation exon information, UniProt annotations and the frequency of missense changes in the COSMIC database[6,18]. We describe a new method, referred to as NIPS, that integrates 3D interface interactions, network topology and information on sequence evolution to determine which mutations identified in cancer genomes are likely to be deleterious. Using NIPS, users can discover new deleterious SAPs and markers related to cancer prognosis.

