Abstract

Background: Inference of remote homology between proteins is very challenging and remains a prerogative of an expert. Thus a significant drawback to the use of evolutionary-based protein structure classifications is the difficulty in assigning new proteins to unique positions in the classification scheme with automatic methods. To address this issue, we have developed an algorithm to map protein domains to an existing structural classification scheme and have applied it to the SCOP database.

Results: The general strategy employed by this algorithm is to combine the results of several existing sequence and structure comparison tools applied to a query protein of known structure in order to find the homologs already classified in the SCOP database and thus determine classification assignments. The algorithm is able to map domains within newly solved structures to the appropriate SCOP superfamily level with ~95% accuracy. Examples of correctly mapped remote homologs are discussed. The algorithm is also capable of identifying potential evolutionary relationships not specified in the SCOP database, which can help improve the classification. The strategy of the mapping algorithm is not limited to SCOP and can be applied to any other evolutionary-based classification scheme as well. SCOPmap is available for download.

Conclusion: The SCOPmap program is useful for assigning domains in newly solved structures to appropriate superfamilies and for identifying evolutionary links between different superfamilies.
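The general strategy of combining results from several comparison tools can be illustrated with a minimal sketch. This is not SCOPmap's actual decision logic, which the abstract does not specify. The `Hit` record, the `assign_superfamily` function, the tool names, the superfamily labels, and the simple majority rule are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional, Set


class Hit(NamedTuple):
    """One tool's best match for a query domain (hypothetical record)."""
    tool: str          # name of the comparison tool that produced the hit
    superfamily: str   # superfamily label of the matched classified domain
    confident: bool    # whether the hit passed that tool's own threshold


def assign_superfamily(hits: List[Hit]) -> Optional[str]:
    """Return the superfamily supported by the most tools, or None.

    Assumption: a simple majority rule over confident hits. Ties and
    queries with no confident hits are left unassigned.
    """
    support: Dict[str, Set[str]] = defaultdict(set)
    for hit in hits:
        if hit.confident:
            support[hit.superfamily].add(hit.tool)
    if not support:
        return None
    # Rank by number of supporting tools, then by label for determinism.
    ranked = sorted(support.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    if len(ranked) > 1 and len(ranked[0][1]) == len(ranked[1][1]):
        return None  # ambiguous: equal support for two superfamilies
    return ranked[0][0]


query_hits = [
    Hit("sequence_tool", "sf_A", True),
    Hit("profile_tool", "sf_A", True),
    Hit("structure_tool", "sf_B", True),
]
assignment = assign_superfamily(query_hits)
```

In this sketch, a query left unassigned would be passed on for manual inspection rather than forced into a superfamily.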

Highlights

  • Inference of remote homology between proteins is very challenging and remains a prerogative of an expert

  • Although the databases mentioned above are associated with automatic methods for identifying potential structural neighbors of a new protein query, they are often incapable of assigning domains to a unique position in the classification according to evolutionary relationships

  • Performance of individual comparison methods: to assess the relative performance of the individual comparison tools used by SCOPmap, the number of assignments in the tweaking set gained by each additional comparison method was evaluated
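The evaluation described in the last highlight, counting the assignments gained by each additional comparison method, can be sketched as follows. This is an illustrative reading of that sentence, not the authors' code. The function name `incremental_gains`, the method names, and the domain identifiers are hypothetical.

```python
from typing import Dict, List, Set


def incremental_gains(
    method_order: List[str], assigned: Dict[str, Set[str]]
) -> Dict[str, int]:
    """Count the domains first assigned by each method, added in order.

    `assigned` maps a method name to the set of domain identifiers that
    the method could assign on its own.
    """
    covered: Set[str] = set()
    gains: Dict[str, int] = {}
    for method in method_order:
        # Only domains not already covered by earlier methods count.
        new_domains = assigned.get(method, set()) - covered
        gains[method] = len(new_domains)
        covered |= new_domains
    return gains


# Hypothetical data: which domains each method can assign.
example = {
    "method_a": {"d1", "d2", "d3"},
    "method_b": {"d2", "d3", "d4"},
    "method_c": {"d4", "d5"},
}
gains = incremental_gains(["method_a", "method_b", "method_c"], example)
# gains == {"method_a": 3, "method_b": 1, "method_c": 1}
```

The result depends on the order in which methods are added, so a method listed late gets credit only for domains that no earlier method could assign.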

Summary

Introduction

Inference of remote homology between proteins is very challenging and remains a prerogative of an expert. Several structural classification schemes such as SCOP [1], CATH [2], and Dali Domain Dictionary [3] have been developed for the purpose of cataloguing all available protein structures. These databases are commonly used for studying structural and evolutionary relationships between proteins. Although manual classification of protein structures remains the gold standard, the necessity for reliable automatic tools that can reproduce the results of such a classification scheme becomes increasingly apparent as available databases continue to grow in size. Such tools must be capable of detecting homology between distantly related proteins while keeping false positives at a minimum.

