Abstract

Systematic interrogation of mutation or protein modification data is important for identifying sites with functional consequences and for deducing global consequences from large data sets. Mechismo (mechismo.russellab.org) enables simultaneous consideration of thousands of 3D structures and biomolecular interactions to rapidly predict the mechanistic consequences of mutations and modifications. As useful functional information often comes only from homologous proteins, we benchmarked the accuracy of predictions as a function of protein/structure sequence similarity, which permits the use of relatively weak sequence similarities with an appropriate confidence measure. For protein–protein, protein–nucleic acid and a subset of protein–chemical interactions, we also developed and benchmarked a measure of whether modifications are likely to enhance or diminish the interactions, which can assist the detection of modifications with specific effects. Analysis of high-throughput sequencing data shows that the approach can identify interesting differences between cancers, and application to proteomics data finds potential mechanistic insights into how post-translational modifications can alter biomolecular interactions.

Highlights

  • Understanding the mechanistic basis of why particular changes in proteins have the effects they do is one of the great challenges in biology, and it requires a deeper integration of HTS and proteomics techniques with information on protein 3D structures

Introduction

High-throughput sequencing (HTS) has led to the systematic identification of thousands of protein variants [1], from which the aim is to identify those most likely to impact biological systems or cause disease. Advances in proteomics have likewise produced data sets of thousands of post-translational modifications (PTMs) [2], with the aim of finding those of biomedical consequence. These are just two examples of a wider trend in life science research, where data generation is often faster than interpretation, making tools that aid the ranking and analysis of such findings increasingly important. The current flood of variant and modification data coincides with a growing set of protein 3D structures and interactions. All protein domains have at least one representative structure, and the number of interactions for which structures are known or modelled grows continuously [3,4,5,6,7]. Intense interaction discovery efforts provide an increasingly complete set of biomolecular interactions (e.g. [8,9]), and tens of thousands of interactions are known for most of the major model organisms.
