Abstract

Quality assessment (QA) for predicted protein structural models is an important and challenging research problem in protein structure prediction. Consensus Global Distance Test (CGDT) methods assess each decoy (predicted structural model) based on its structural similarity to all other decoys in a decoy set, and have been shown to work well when good decoys form a majority cluster. Scoring functions, by contrast, evaluate each decoy individually based on its structural properties. Both approaches have their merits and limitations. In this paper, we present a novel method called PWCom, which uses two neural networks in sequence to combine CGDT with single-model scoring methods such as RW, DDFire and OPUS-Ca. Specifically, for every pair of decoys, the difference of the corresponding feature vectors is input to the first neural network, which predicts whether the two decoys differ significantly in terms of their GDT scores to the native structure. If they do, the second neural network is used to decide which of the two is closer to the native structure. The quality score for each decoy in the pool is based on the number of times it wins during the pairwise comparisons. Test results on three benchmark datasets from different model generation methods showed that PWCom significantly improves over consensus GDT and single scoring methods. The QA server (MUFOLD-Server) applying this method in the CASP 10 QA category ranked second in terms of Pearson and Spearman correlation performance.
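The pairwise win-counting scheme described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `pairwise_win_counts` and the two callables `is_significant` and `first_is_better` are hypothetical stand-ins for the two trained neural networks. The sketch also assumes that the second stage receives the same feature difference vector as the first, and that a pair judged not significantly different awards no win to either decoy; the abstract does not specify either point.

```python
from itertools import combinations
from typing import Callable, List, Sequence

Vector = Sequence[float]


def pairwise_win_counts(
    features: Sequence[Vector],
    is_significant: Callable[[Vector], bool],
    first_is_better: Callable[[Vector], bool],
) -> List[int]:
    """Count pairwise wins for each decoy.

    features[i] is the feature vector of decoy i (for example, its
    consensus GDT score and single-model scores).
    is_significant stands in for the first network: given the feature
    difference of a pair, does the pair differ significantly in quality?
    first_is_better stands in for the second network: given the same
    difference, is the first decoy of the pair the better one?
    """
    wins = [0] * len(features)
    for i, j in combinations(range(len(features)), 2):
        # Difference of the two feature vectors, decoy i minus decoy j.
        diff = [a - b for a, b in zip(features[i], features[j])]
        if not is_significant(diff):
            # Assumption: near-ties award no win to either decoy.
            continue
        if first_is_better(diff):
            wins[i] += 1
        else:
            wins[j] += 1
    return wins


# Toy stand-ins for the two networks, using a single made-up feature.
def toy_is_significant(diff: Vector) -> bool:
    return abs(diff[0]) > 0.05


def toy_first_is_better(diff: Vector) -> bool:
    return diff[0] > 0


toy_features = [[0.80], [0.50], [0.52], [0.30]]
wins = pairwise_win_counts(toy_features, toy_is_significant, toy_first_is_better)
print(wins)  # [3, 1, 1, 0]
```

In this toy run the second and third decoys are too close to be separated, so neither gains a win from their comparison, and the final ranking follows the win counts.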

Highlights

  • Protein three-dimensional (3D) structures are of great significance for protein function analysis

  • In benchmark 2, Consensus Global Distance Test (CGDT) is significantly better than OPUS-Ca, DDFire, RW and Position Specific Probability Sum (PSPS) in terms of all three measures

  • Our new approach combines the advantages of consensus GDT method and single scoring functions through pairwise comparison and a two-stage machine-learning scheme


Introduction

Protein three-dimensional (3D) structures are of great significance for protein function analysis. Experimental methods for protein structure determination, such as X-ray crystallography and nuclear magnetic resonance (NMR), are costly and time-consuming. Computational prediction provides an economical way to bridge the increasing gap between the number of available protein primary sequences and 3D structures. It is a viable and efficient approach to studying proteins [1]. Structural models of medium accuracy, with a Root-Mean-Square Deviation (RMSD) of up to 6 Å, are often useful for understanding protein functions [2]. Protein structure prediction from primary sequences is still a research challenge [3,4].
