Protein structure quality assessment based on the distance profiles of consecutive backbone Cα atoms

Sandeep Chakraborty, Basuthkar J Rao, Abhaya M Dandekar, Ravindra Venkatramani, Bjarni Asgeirsson

doi:10.12688/f1000research.2-211.v3


Open Access

https://doi.org/10.12688/f1000research.2-211.v3


Abstract

Predicting the three-dimensional native state structure of a protein from its primary sequence is an unsolved grand challenge in molecular biology. Two main computational approaches have evolved to obtain the structure from the protein sequence - ab initio/de novo methods and template-based modeling - both of which typically generate multiple possible native state structures. Model quality assessment programs (MQAP) validate these predicted structures in order to identify the correct native state structure. Here, we propose an MQAP for assessing the quality of protein structures based on the distances between consecutive Cα atoms. We hypothesize that the root-mean-square deviation of the distance between consecutive Cα atoms (RDCC) from the ideal value of 3.8 Å, derived from a statistical analysis of high quality protein structures (top100H database), is minimized in native structures. Based on tests with the top100H set, we propose an RDCC cutoff value of 0.012 Å, above which a structure can be filtered out as non-native. We applied the RDCC discriminator to decoy sets from the Decoys 'R' Us database and show that the native structures in all decoy sets tested have an RDCC below the 0.012 Å cutoff. While most decoy sets were either indistinguishable using this discriminator or had very few violations, all the decoy structures in the fisa decoy set were discriminated by applying the RDCC criterion. This highlights the physical non-viability of the fisa decoy set, and possible issues in benchmarking other methods using this set. The source code and manual are available at https://github.com/sanchak/mqap and permanently available at 10.5281/zenodo.7134.

Highlights

The structure of a protein is a veritable source of information about its physiological relevance in the cellular context[1].
There are essentially two different computational approaches to predicting a protein structure from its primary sequence: 1) template-based methods (TBM), which are based on features obtained from the database of known protein structures[2,3,4], and 2) ab initio or de novo methods, which are based on the intrinsic laws governing atomic interactions and are applicable in the absence of a template structure with significant sequence homology[5,6].
Selecting the best candidate from the set of putative structures is an essential step that is performed by model quality assessment programs (MQAP).

Summary

Introduction

The structure of a protein is a veritable source of information about its physiological relevance in the cellular context[1]. The Boltzmann hypothesis states that if the database of known native protein structures is assumed to be a statistical system in thermodynamic equilibrium, specific structural features would be populated based on the free energy of the protein conformational state. Sippl argued, using the converse logic, that the frequencies of occurrence of structural features such as interatomic distances in the database of known protein structures could determine a free energy (potential of mean force) for a given protein conformation, and be used to discriminate the native structure[18,19]. In a set of high quality protein structures (top100H[27]), we demonstrate that the distances between consecutive Cα atoms are normally distributed, with a mean of 3.8 Å and a standard deviation of 0.04 Å. Based on this observation, the reference state for our statistical potential calculations is defined as one where all consecutive Cα atoms are 3.8 Å apart. We propose a simple and fast discriminator for protein structure quality, based on the distance profiles of consecutive backbone Cα atoms, that identifies decoy structures that are physically nonviable.
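The discriminator described here can be illustrated with a minimal Python sketch. It is not the authors' implementation (their code is in the repository cited in the abstract); the function names `parse_ca_coords`, `rdcc` and `passes_rdcc_filter` are hypothetical. The sketch assumes the plain textbook reading of RDCC, namely the square root of the mean squared deviation of consecutive Cα distances from 3.8 Å, and it ignores chain breaks and alternate locations. The exact normalization used in the paper is not stated in this section, so values produced by this sketch may not be directly comparable with the 0.012 Å cutoff.

```python
import math

IDEAL_CA_DISTANCE = 3.8  # Å, ideal consecutive Cα distance quoted in the text
RDCC_CUTOFF = 0.012      # Å, cutoff quoted in the abstract


def parse_ca_coords(pdb_lines):
    """Collect (x, y, z) of Cα atoms from PDB-format ATOM records.

    Assumption: fixed-column PDB format; HETATM records (e.g. calcium
    ions, which also use the name CA) are skipped on purpose.
    """
    coords = []
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            x = float(line[30:38])
            y = float(line[38:46])
            z = float(line[46:54])
            coords.append((x, y, z))
    return coords


def rdcc(coords, ideal=IDEAL_CA_DISTANCE):
    """Root-mean-square deviation of consecutive Cα distances from `ideal`.

    Assumption: RMS is taken over the N-1 consecutive pairs; the paper's
    own normalization may differ.
    """
    if len(coords) < 2:
        raise ValueError("need at least two Cα atoms")
    squared_devs = [
        (math.dist(a, b) - ideal) ** 2 for a, b in zip(coords, coords[1:])
    ]
    return math.sqrt(sum(squared_devs) / len(squared_devs))


def passes_rdcc_filter(coords, cutoff=RDCC_CUTOFF):
    """True if the structure is not filtered out by the RDCC cutoff."""
    return rdcc(coords) <= cutoff
```

A typical use would be to read a model file, call `parse_ca_coords` on its lines, and discard the model when `passes_rdcc_filter` returns False. Keeping the ideal distance and the cutoff as parameters makes it easy to match a different normalization.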

Results and discussion

Materials and methods

Zhang Y

15. McGuffin LJ

19. Sippl MJ

32. Vriend G

47. Tosatto SC


Journal: F1000Research	Publication Date: Dec 17, 2013
Citations: 5	License type: cc-by
