Atom-wise statistics and prediction of solvent accessibility in proteins

Y Hemajit Singh, M Michael Gromiha, Akinori Sarai, Shandar Ahmad

doi:10.1016/j.bpc.2006.06.013

Abstract

In this work, we explore a novel method to broaden the scope of sequence-based predictions of solvent accessibility, or accessible surface area (ASA), to the atomic level. All 167 heavy atoms from the 20 types of amino acid residues in proteins have been studied. An analysis of the ASA distribution of these atomic groups in different proteins has been performed, and rotamer-style libraries have been developed. We observe that the ASA of some atomic groups (e.g., backbone C and N atoms) can be estimated from the sequence environment within a mean absolute error of 2–3 Å². However, some side chain atoms, such as CG in Pro, NH1 in Arg and NE2 in Gln, show strong variability, making it more difficult to estimate their ASA from the sequence environment. In general, the prediction of ASA becomes more difficult for atomic positions at the side chain extremities of long amino acid residues (aromatic side chain terminals being the exception). Several atomic groups are frequently exposed to solvent. Some of them have a bimodal distribution, suggesting two stable conformations in terms of their solvent exposure. A more detailed understanding and prediction of solvent accessibility, i.e., at the atomic level, is expected to help in bioinformatics approaches to structure prediction, in analysis of the functional relevance of atomic solvent accessibilities, and in other interaction analyses.
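The mean absolute error quoted in the abstract can be illustrated with a minimal sketch. It assumes predictions and observations are available as per-atom records in Å²; the function name `atomwise_mae`, the record layout and the numbers are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def atomwise_mae(records):
    """Return the mean absolute error of ASA per (residue, atom) pair.

    records: iterable of (residue, atom, observed_asa, predicted_asa),
    with both ASA values in square angstroms. This layout is an assumption
    made for illustration only.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for residue, atom, observed, predicted in records:
        key = (residue, atom)
        sums[key] += abs(observed - predicted)
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Made-up values for illustration; these are not data from the paper.
example = [
    ("ALA", "N", 2.0, 4.0),
    ("ALA", "N", 5.0, 4.0),
    ("ARG", "NH1", 30.0, 18.0),
    ("ARG", "NH1", 4.0, 18.0),
]
mae = atomwise_mae(example)
for (residue, atom), error in sorted(mae.items()):
    print(f"{residue} {atom}: MAE = {error:.1f} Å²")
```

Grouping the error by atom type, rather than averaging over all atoms, is what lets a low-variability atom (such as a backbone N) be compared with a highly variable side chain terminal (such as NH1 in Arg).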
