Distant homology recognition using structural classification of proteins.

Alexey G Murzin, Alex Bateman

doi:10.1002/(sici)1097-0134(1997)1+<105::aid-prot14>3.3.co;2-1

Abstract

Protein structure prediction is arguably the biggest unsolved problem of structural biology. The notion that the number of naturally occurring protein folds is limited allows a partial solution of this problem by the use of fold recognition methods, which "thread" the sequence in question through a library of known protein folds. Fold recognition methods were thought to be superior to distant homology recognition methods when there is no significant sequence similarity to known structures. We show here that the Structural Classification of Proteins (SCOP) database, which organizes all known protein folds according to their structural and evolutionary relationships, can be used effectively to enhance the sensitivity of distant homology recognition methods so that they rival the "threading" methods. In the CASP2 experiment, our approach correctly assigned all six of the "fold recognition" targets we attempted to existing SCOP superfamilies. For each of the six targets, we correctly predicted a homologous protein with a very similar structure; often, it was the most similar structure. We also correctly predicted local alignments of the sequence features that we found to be characteristic of the protein superfamily containing a given target. Our global alignments, extended manually from these local alignments, also appeared to be rather accurate.

Full Text