Abstract
The contact number of an amino acid residue in a protein structure is defined as the number of C(beta) atoms around the C(beta) atom of the given residue, a quantity similar to, but different from, solvent-accessible surface area. We present a method to predict the contact numbers of a protein from its amino acid sequence. The method is based on a simple linear regression scheme and predicts the absolute values of contact numbers. When single sequences are used for both parameter estimation and cross-validation, the present method predicts the contact numbers with a correlation coefficient of 0.555 on average. When multiple sequence alignments are used, the correlation increases to 0.627, which is a significant improvement over previous methods. In terms of discrete-state prediction, the accuracies for 2-, 3-, and 10-state predictions are, respectively, 71.4%, 54.1%, and 18.9% with residue-type-dependent unbiased thresholds, and 76.3%, 59.2%, and 21.8% with residue-type-independent unbiased thresholds. We discuss how accessible surface area and contact number differ from a prediction viewpoint, and how contact number prediction can be applied to three-dimensional structure prediction.
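To make the definition of contact number concrete, the following is a minimal sketch of how it could be computed from C(beta) coordinates. The abstract does not state what counts as "around", so the hard distance cutoff, its default value of 12 angstroms, and the function name `contact_numbers` are all assumptions made for illustration. The sketch also counts sequence neighbours, which the actual method may treat differently.

```python
import numpy as np

def contact_numbers(cb_coords, cutoff=12.0):
    """Count, for each residue, the other C(beta) atoms closer than `cutoff`.

    `cutoff` (in angstroms) is an assumed value; the abstract does not give one.
    """
    coords = np.asarray(cb_coords, dtype=float)
    # Pairwise distances between all C(beta) atoms, shape (n, n).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    within = dist < cutoff
    # An atom is not its own contact, so clear the diagonal.
    np.fill_diagonal(within, False)
    return within.sum(axis=1)

# Toy input: four hypothetical C(beta) positions on a line, 5 angstroms apart.
toy = [[0, 0, 0], [5, 0, 0], [10, 0, 0], [15, 0, 0]]
print(contact_numbers(toy))
```

With this toy input, the two interior atoms each have three neighbours within the cutoff and the two end atoms each have two, illustrating how buried positions receive larger contact numbers than exposed ones.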