Abstract

Protein tertiary structure determines the molecular function, interactions, and stability of a protein; the distribution of mutations within the tertiary structure can therefore facilitate the identification of new driver genes in cancer. To analyze mutation distribution in protein tertiary structures, we applied a novel three-dimensional (3D) permutation test to the mutation positions. We analyzed somatic mutation datasets for 21 cancer types obtained from exome sequencing conducted by the TCGA project. Of the 3,622 genes that had ≥3 mutations in regions with tertiary structure data, 106 showed a significant skew in mutation distribution. Known tumor suppressors and oncogenes were significantly enriched among the identified genes. Physical distances between mutations in known oncogenes were significantly smaller than those in tumor suppressors. Twenty-three genes were detected in multiple cancers. Candidate genes with a significant skew in 3D mutation distribution included kinases (MAPK1, EPHA5, ERBB3, and ERBB4), an apoptosis-related gene (APP), an RNA splicing factor (SF1), a miRNA processing factor (DICER1), an E3 ubiquitin ligase (CUL1), and transcription factors (KLF5 and EEF1B2). Our study suggests that systematic analysis of mutation distribution in the tertiary protein structure can help identify cancer driver genes.

Highlights

  • Protein tertiary structure determines the molecular function, interactions, and stability of a protein, so the distribution of mutations in the tertiary structure can facilitate the identification of new driver genes in cancer.

  • Protein 3D structures were downloaded from the Protein Data Bank (PDB), and the amino acid sequence of each protein structure was aligned to that of the reference genome using MAFFT[9].

  • Fifty-one genes were uniquely identified by our method, including FGFR2, HRAS, NFE2L2, and DICER1. These results suggest that the 3D permutation method can complement the gene burden test and previous methods for detecting mutation clusters, and can identify new candidates, although the analysis is restricted to genes with available 3D structures.
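
To make the idea of a 3D permutation test concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each residue is represented by one 3D coordinate (for example, its C-alpha atom), that the test statistic is the mean pairwise distance between mutated residues, and that the null model places the same number of mutations uniformly at random on residues with structure data. The function names `mean_pairwise_distance` and `permutation_test_3d` and all parameters are hypothetical.

```python
import itertools
import math
import random


def mean_pairwise_distance(coords):
    """Mean Euclidean distance over all pairs of 3D points."""
    pairs = list(itertools.combinations(coords, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def permutation_test_3d(residue_coords, mutated_residues, n_perm=2000, seed=0):
    """Empirical test for spatial clustering of mutations (illustrative only).

    residue_coords   : dict mapping residue index -> (x, y, z)
    mutated_residues : list of residue indices, one entry per mutation
                       (recurrent mutations appear more than once)
    Returns (observed statistic, one-sided empirical p-value).
    """
    if len(mutated_residues) < 2:
        raise ValueError("need at least two mutations to measure distances")
    rng = random.Random(seed)
    residues = list(residue_coords)
    k = len(mutated_residues)
    observed = mean_pairwise_distance(
        [residue_coords[r] for r in mutated_residues]
    )
    hits = 0
    for _ in range(n_perm):
        # Assumed null model: mutations fall uniformly on structured residues,
        # sampled with replacement so recurrent hits remain possible.
        sample = rng.choices(residues, k=k)
        stat = mean_pairwise_distance([residue_coords[r] for r in sample])
        if stat <= observed:
            hits += 1
    # Add-one correction keeps the empirical p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Toy structure: 50 residues spaced 3.8 units apart along a straight line.
toy_coords = {i: (3.8 * i, 0.0, 0.0) for i in range(50)}
clustered_stat, clustered_p = permutation_test_3d(toy_coords, [10, 10, 11, 12])
spread_stat, spread_p = permutation_test_3d(toy_coords, [0, 16, 33, 49])
```

In this toy example the tightly grouped mutations yield a small p-value and the widely spread mutations do not. A real analysis would take coordinates from PDB structures and would need multiple-testing correction across genes.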

Summary

Introduction

Protein tertiary structure determines the molecular function, interactions, and stability of a protein, so the distribution of mutations in the tertiary structure can facilitate the identification of new driver genes in cancer. Statistical comparison between the numbers of observed and expected mutations, i.e. the gene burden test, has been performed, and many driver genes have been discovered[3,4]. Although this framework has been used successfully to identify new driver genes and pathways, recent studies reveal that most cancers are very heterogeneous and that many low-frequency driver genes exist in the long tail of mutated gene lists[3,4]. Our study suggests that systematic analysis of mutation distribution in the tertiary protein structure can help identify cancer driver genes. We consider that our method can complement traditional gene burden tests and contribute to a better interpretation of mutations in cancer genomes.
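
For orientation, the comparison of observed and expected mutation counts behind a gene burden test can be sketched as below. This is a simplified illustration under stated assumptions, not the procedure used in the cited studies: it assumes a single uniform background mutation rate per base per sample and a Poisson model for the mutation count in a gene. The names `poisson_upper_tail` and `burden_test` and the example numbers are hypothetical.

```python
import math


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed from the lower tail."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def burden_test(observed, background_rate, coding_length, n_samples):
    """Compare an observed mutation count with its expectation.

    Assumes a uniform background rate (mutations per base per sample),
    which real burden tests typically refine with covariates.
    Returns (expected count, one-sided p-value).
    """
    expected = background_rate * coding_length * n_samples
    return expected, poisson_upper_tail(observed, expected)


# Hypothetical gene: 3,000 coding bases, 500 samples, 12 observed mutations.
expected_count, p_value = burden_test(12, 1e-6, 3000, 500)
```

Because this test only counts mutations, a gene with few mutations that nevertheless cluster in the structure can escape it, which is the gap the 3D analysis is meant to address.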
