Abstract
Homology modeling is currently the most frequently used approach for protein structure prediction. The 3D model building phase of this methodology is critical for obtaining an accurate and biologically useful prediction. The most widely employed tool to perform this task is MODELLER. This program implements the "modeling by satisfaction of spatial restraints" strategy, and its core algorithm has not been altered significantly since the early 1990s. In this work, we have explored the idea of modifying MODELLER with two effective yet computationally light strategies to improve its 3D modeling performance. First, we have investigated how the accuracy with which structural variability between a target protein and its templates is estimated, in the form of σ values, profoundly influences 3D modeling. We show that the σ values produced by MODELLER are on average weakly correlated with the true level of structural divergence between target-template pairs and that increasing this correlation greatly improves the program's predictions, especially in multiple-template modeling. Second, we have examined how incorporating statistical potential terms (such as the DOPE potential) into MODELLER's objective function positively impacts 3D modeling quality, providing a small but consistent improvement in metrics such as GDT-HA and lDDT and a large increase in stereochemical quality. Python modules to harness this second strategy are freely available at https://github.com/pymodproject/altmod. In summary, we show that there is considerable room for improving MODELLER in terms of 3D modeling quality, and we propose strategies that could be pursued in order to further increase its performance.
Highlights
In silico protein structure prediction constitutes an invaluable tool in biomedical research, since it allows structural information to be obtained for a large number of proteins currently lacking an experimentally determined 3D structure [1]
In this work we have not modified the MODELLER algorithm for σ value assignment; instead, we propose strategies that could be pursued in the near future to greatly increase the performance of the program
We show that the use of |Δdn| values for Gaussian homology-derived distance restraints (HDDRs) is supported by theory, as it can be analytically proven that they maximize the likelihood of obtaining a model in which each restrained dm is equal to its corresponding dn
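The likelihood argument in the last highlight can be checked numerically with a minimal sketch. It rests on assumptions not spelled out in this excerpt: that a Gaussian HDDR is centered on the template distance (called `d_t` here), that dn is the corresponding distance in the native target structure, and that Δdn = dn − dt. Under those assumptions, setting the derivative of the Gaussian log-density with respect to σ to zero gives σ = |dn − dt|. The distances and the σ grid below are hypothetical illustration values, not data from the paper.

```python
import math


def gaussian_log_likelihood(d_n, d_t, sigma):
    """Log-density of the native distance d_n under a Gaussian N(d_t, sigma^2)."""
    return (-math.log(sigma * math.sqrt(2.0 * math.pi))
            - (d_n - d_t) ** 2 / (2.0 * sigma ** 2))


def best_sigma_on_grid(d_n, d_t, grid):
    """Return the grid value of sigma with the highest log-likelihood."""
    return max(grid, key=lambda s: gaussian_log_likelihood(d_n, d_t, s))


# Hypothetical distances in angstroms (assumed values for illustration only).
d_t = 10.0  # distance observed in the template
d_n = 11.5  # corresponding distance in the native target structure

# Candidate sigma values from 0.05 to 5.00 in steps of 0.05.
grid = [0.05 * k for k in range(1, 101)]

best = best_sigma_on_grid(d_n, d_t, grid)
print("sigma maximizing the likelihood on the grid:", round(best, 2))
print("|d_n - d_t|:", abs(d_n - d_t))
```

A σ smaller than |dn − dt| makes the restraint too tight to accommodate the native distance, while a larger σ spreads probability mass over distances that are not needed, so the likelihood peaks where the two coincide.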
Summary
In silico protein structure prediction constitutes an invaluable tool in biomedical research, since it allows structural information to be obtained for a large number of proteins currently lacking an experimentally determined 3D structure [1]. In past years, template-based modeling (TBM) has been considered the most accurate approach [2], but it has recently been shown that template-free strategies reach comparable levels of performance for protein targets that lack good templates [3] (for example, members of several membrane protein families [4]). Despite this, TBM methods currently remain the instrument of choice for many researchers, thanks to their speed, flexibility and growing template libraries [5]. In TBM, information from the templates is used to build a 3D atomic model of the target.