ASH structure alignment package: Sensitivity and selectivity in domain classification

Daron M Standley, Haruki Nakamura, Hiroyuki Toh

doi:10.1186/1471-2105-8-116

Abstract

Background

Structure alignment methods offer the possibility of measuring distant evolutionary relationships between proteins that are not visible by sequence-based analysis. However, the question of how structural differences and similarities ought to be quantified in this regard remains open. In this study we construct a training set of sequence-unique CATH and SCOP domains, from which we develop a scoring function that can reliably identify domains with the same CATH topology and SCOP fold classification. The score is implemented in the ASH structure alignment package, for which the source code and a web service are freely available from the PDBj website.

Results

The new ASH score shows increased selectivity and sensitivity compared with values reported for several popular programs using the same test set of 4,298,905 structure pairs, yielding an area of 0.96 under the receiver operating characteristic (ROC) curve. In addition, weak sequence homologies between similar domains are revealed that could not be detected by BLAST sequence alignment. Also, a subset of domain pairs is identified that exhibit high similarity, even though their CATH and SCOP classifications differ. Finally, we show that the ranking of alignment programs based solely on geometric measures depends on the choice of the quality measure.

Conclusion

ASH shows high selectivity and sensitivity with regard to domain classification, an important step in defining distantly related protein sequence families. Moreover, the CPU cost per alignment is competitive with the fastest programs, making ASH a practical option for large-scale structure classification studies.

Highlights

Structure alignment methods offer the possibility of measuring distant evolutionary relationships between proteins that are not visible by sequence-based analysis.
We present a streamlined version of the methodology, released both as a web service and as a suite of command-line programs: a faster version of GASH; a streamlined program for more rapid pair-wise alignment (RASH); a batch version of RASH for processing a list of templates (LASH); and a utility program for converting Protein Data Bank (PDB)-formatted files to the native data structure used by ASH (CONVERT).
Several improvements have been made to reduce the computational time of all the above programs.
To avoid over-fitting of the amino acid substitution matrix, cross validation was performed to ensure that the result was not sensitive to the exclusion of any single domain in the training set.

Summary

Results

The new ASH score shows increased selectivity and sensitivity compared with values reported for several popular programs using the same test set of 4,298,905 structure pairs, yielding an area of 0.96 under the receiver operating characteristic (ROC) curve. Weak sequence homologies between similar domains are revealed that could not be detected by BLAST sequence alignment. A subset of domain pairs is identified that exhibit high similarity, even though their CATH and SCOP classifications differ. We show that the ranking of alignment programs based solely on geometric measures depends on the choice of the quality measure.
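As an illustration of the ROC measure quoted above, the following is a minimal sketch of how an area under the ROC curve can be computed from a list of scored structure pairs. It is not the ASH implementation. It assumes that a higher score means greater similarity and that each pair carries a true/false label indicating whether the two domains share the same classification. The function names and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def roc_curve(scores: Sequence[float],
              labels: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points.

    One point is emitted per distinct score threshold, starting at (0, 0).
    The true positive rate is the sensitivity; the false positive rate is
    one minus the specificity.
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative pair")

    # Rank pairs from highest to lowest score (assumption: higher = more similar).
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)

    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every pair tied at this score before emitting a point,
        # so that ties contribute a diagonal segment rather than a step.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Integrate the ROC curve with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data (hypothetical): six scored pairs, three of which are true matches.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
example_labels = [True, True, False, True, False, False]
example_auc = auc(roc_curve(example_scores, example_labels))
```

An area of 1.0 corresponds to a score that ranks every same-class pair above every different-class pair, and 0.5 corresponds to a score that carries no information about classification.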



Journal: BMC Bioinformatics
Publication Date: Apr 4, 2007
Citations: 62
License type: CC BY 2.0


Similar Papers

Comprehensive Evaluation of Protein Structure Alignment Methods: Scoring by Geometric Measures
Rachel Kolodny ... Michael Levitt
Journal of Molecular Biology | VOL. 346 | 31 Dec 2004

Computational discovery of direct associations between GO terms and protein domains
Seyed Ziaeddin Alborzi ... Marie-Dominique Devignes
BMC Bioinformatics | VOL. 19 | 01 Nov 2018

A fast structural multiple alignment method for long RNA sequences
Yasuo Tabei ... Kiyoshi Asai
BMC Bioinformatics | VOL. 9 | 23 Jan 2008

Domain Classification by Granular Computing used in a P2P Approach for Web Service Discovery
Lican Huang ... Yong Liu
01 Sep 2008
