Abstract

An automatic algorithm based on inter-residue contacts is presented to identify domains in proteins. The results of the algorithm are compared to an assignment performed by inspection that was guided by the authors' description in the literature. The authors' and the algorithm's assignments for a chain were considered to agree if the same number of domains was identified and if the assignments were the same for at least 95% of the residues. With this criterion, the algorithm agreed with the authors' assignment for 78% of the 284 non-redundant chains considered. When some of the authors' assignments were re-evaluated based on the results of the algorithm, an agreement of 84% was obtained. The algorithm is therefore a useful tool for data validation in domain assignment. The authors' assignments of domains were analysed for structural principles of domains. The numbers of chains forming one, two, three, four and five domains are 197, 67, 13, 6 and 1, respectively. Most domains in multidomain proteins are formed from continuous segments and adopt the same structural class. Distributions of the number of residues and the ellipticity of domains and chains are presented. The relationship between accessible surface area and molecular weight for domains and chains is examined.
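
The agreement criterion described above (same number of domains, and identical assignment for at least 95% of the residues) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `assignments_agree`, the per-residue label lists, and the way domain labels are matched between the two assignments (trying every one-to-one relabelling and keeping the best) are assumptions made for illustration, since the abstract does not state how labels were paired.

```python
from itertools import permutations


def assignments_agree(reference, predicted, threshold=0.95):
    """Return True if two per-residue domain assignments agree.

    reference, predicted: equal-length sequences holding one domain
    label per residue (hypothetical input format, not from the paper).
    threshold: minimum fraction of residues that must match (0.95 in
    the abstract's criterion).
    """
    if len(reference) != len(predicted):
        raise ValueError("both assignments must cover the same residues")
    if len(reference) == 0:
        raise ValueError("assignments must not be empty")

    # Distinct domain labels, in order of first appearance.
    ref_labels = list(dict.fromkeys(reference))
    pred_labels = list(dict.fromkeys(predicted))

    # First condition: the same number of domains must be identified.
    if len(ref_labels) != len(pred_labels):
        return False

    # Second condition: labels are arbitrary, so try every one-to-one
    # mapping of predicted labels onto reference labels (an assumption;
    # feasible because chains here have at most five domains).
    best_matches = 0
    for perm in permutations(ref_labels):
        mapping = dict(zip(pred_labels, perm))
        matches = sum(
            1 for r, p in zip(reference, predicted) if mapping[p] == r
        )
        best_matches = max(best_matches, matches)

    return best_matches / len(reference) >= threshold


# Hypothetical 100-residue chain with two domains.
reference = [1] * 60 + [2] * 40
close = ["A"] * 58 + ["B"] * 42    # boundary shifted by 2 residues
far = ["A"] * 50 + ["B"] * 50      # boundary shifted by 10 residues
single = ["A"] * 100               # only one domain identified

print(assignments_agree(reference, close))
print(assignments_agree(reference, far))
print(assignments_agree(reference, single))
```

In this made-up example, the first prediction differs from the reference at 2 of 100 residues and so meets the 95% threshold, the second differs at 10 residues and does not, and the third fails immediately because the number of domains differs.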
