Abstract

Despite interest in associating polymorphisms with clinical or experimental phenotypes, functional interpretation of mutation data has lagged behind the generation of data from modern high-throughput techniques, and the accurate prediction of the molecular impact of a mutation remains a non-trivial task. We present here an integrated knowledge-driven computational workflow designed to evaluate the effects of experimental and disease missense mutations on protein structure and interactions. We exemplify its application with analyses of saturation mutagenesis of DBR1 and Gal4 and show that the experimental phenotypes for over 80% of the mutations correlate well with predicted effects of mutations on protein stability and RNA binding affinity. We also show that analysis of mutations in VHL using our workflow provides valuable insights into the effects of mutations and their links to the risk of developing renal carcinoma. Taken together, the analyses of the three examples demonstrate that structural bioinformatics tools, when applied in a systematic, integrated way, can rapidly analyse a given system and provide a powerful approach for predicting the structural and functional effects of thousands of mutations in order to reveal the molecular mechanisms leading to a phenotype.

Missense or non-synonymous mutations are nucleotide substitutions that alter the amino acid sequence of a protein. Their effects range from modifying transcription, translation, processing and splicing, and localization to changing the stability of the protein or altering its dynamics or its interactions with other proteins, nucleic acids and ligands, including small molecules and metal ions. The advent of high-throughput techniques, including sequencing and saturation mutagenesis, has provided large amounts of phenotypic data linked to mutations. However, one of the hurdles has been understanding and quantifying the effects of a particular mutation and how they translate into a given phenotype. One approach to overcome this is to use robust, accurate and scalable computational methods to understand and correlate the structural effects of mutations with disease.

Highlights

  • Over the past twenty years, multiple in silico approaches to predict how mutations affect protein stability have been developed based on various evolutionary and physicochemical hypotheses.

  • A Random Forest binary classifier was trained using stability and protein-protein affinity change predictions from mCSM-Stability, SDM, DUET and mCSM-PPI. Clear cell renal cell carcinoma (ccRCC) risk was predicted with 98% sensitivity and 93% specificity, which is consistent with the results described by Gossage et al. [36].

  • We show that our workflow can predict the relative changes mediated by alterations in protein stability and interactions, thereby providing the opportunity to understand in greater detail the effects of mutations and how they relate to the phenotypes we observe.
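
The sensitivity and specificity quoted in the highlights above are standard confusion-matrix ratios for a binary classifier. The following is a minimal sketch of how such figures are computed, assuming hypothetical labels in which 1 marks a high-risk mutation and 0 a low-risk one. The function name and the toy labels are invented for illustration and are not taken from the study; this is not the authors' evaluation code.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = positive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)  # fraction of true positives recovered
    specificity = tn / (tn + fp)  # fraction of true negatives recovered
    return sensitivity, specificity


# Toy labels, purely illustrative (not data from the paper).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(toy_true, toy_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In practice the predicted labels would come from a trained classifier evaluated on held-out mutations; the ratios themselves are computed the same way.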


Introduction

Over the past twenty years, multiple in silico approaches to predict how mutations affect protein stability have been developed based on various evolutionary and physicochemical hypotheses. These include methods that seek to understand the effects of amino acid substitutions from the protein sequence alone, and those that exploit the extensive structural information available for many proteins. Findlay and colleagues reported in Nature a CRISPR/Cas9 cleavage system coupled with multiplex homology-directed repair to perform saturation editing of a conserved 25 amino acid region of the RNA lariat debranching enzyme DBR1, an essential gene. They coupled this to a growth-based assay to evaluate the phenotypic effects of over 170 distinct missense mutations. We applied our workflow in an attempt to analyse the molecular mechanism underlying the phenotypic effects of these mutations (Figure S1).
