Abstract

BackgroundCommon structural biology methods (i.e., NMR and molecular dynamics) often produce ensembles of molecular structures. Consequently, averaging of 3D coordinates of molecular structures (proteins and RNA) is a frequent approach to obtain a consensus structure that is representative of the ensemble. However, when the structures are averaged, artifacts can result in unrealistic local geometries, including unphysical bond lengths and angles.ResultsHerein, we describe a method to derive representative structures while limiting the number of artifacts. Our approach is based on a Monte Carlo simulation technique that drives a starting structure (an extended or a 'close-by' structure) towards the 'averaged structure' using a harmonic pseudo energy function. To assess the performance of the algorithm, we applied our approach to Cα models of 1364 proteins generated by the TASSER structure prediction algorithm. The average RMSD of the refined model from the native structure for the set becomes worse by a mere 0.08 Å compared to the average RMSD of the averaged structures from the native structure (3.28 Å for refined structures and 3.36 A for the averaged structures). However, the percentage of atoms involved in clashes is greatly reduced (from 63% to 1%); in fact, the majority of the refined proteins had zero clashes. Moreover, a small number (38) of refined structures resulted in lower RMSD to the native protein versus the averaged structure. Finally, compared to PULCHRA [1], our approach produces representative structure of similar RMSD quality, but with much fewer clashes.ConclusionThe benchmarking results demonstrate that our approach for removing averaging artifacts can be very beneficial for the structural biology community. Furthermore, the same approach can be applied to almost any problem where averaging of 3D coordinates is performed. Namely, structure averaging is also commonly performed in RNA secondary prediction [2], which could also benefit from our approach.

Highlights

  • Common structural biology methods (i.e., NMR and molecular dynamics) often produce ensembles of molecular structures

  • Upon application of the MCORE algorithm to the refine the COMBO models, it is found that the refined representative structures have similar root mean square deviation (RMSD) values to native structures, but with far fewer unphysical characteristics

  • We found that in most of the cases where there was a large deviation between the COMBO RMSD and MCORE RMSD, there was involvement of unphysical bond lengths which reiterates the fact that there is tradeoff between local geometric correctness and deviation from the target structure

Read more

Summary

Introduction

Common structural biology methods (i.e., NMR and molecular dynamics) often produce ensembles of molecular structures. Averaging of 3D coordinates of molecular structures (proteins and RNA) is a frequent approach to obtain a consensus structure that is representative of the ensemble. Methods for the experimental or theoretical determination of protein structures often output their results as an ensemble. One way to generate this 'consensus' or 'representative structure' is to calculate the centroid structure by averaging the Cartesian coordinates of the ensemble of superimposed structures. Zagrovic et al [4] proposed the "mean-structure hypothesis" which states that the geometry of the collapsed unfolded state of small peptides and proteins in an average sense corresponds to the geometry of the native structure at equilibrium. It was argued that finding average distance matrices and using distance based root mean square deviation as a metric may be one way to capture the relevant features of ensembles of structure and compare them with other reference structures

Objectives
Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.