Abstract

Background: Detection of common evolutionary origin (homology) is a primary means of inferring protein structure and function. At present, comparison of protein families represented as sequence profiles is arguably the most effective homology detection strategy. However, finding the best way to represent the evolutionary information of a protein sequence family in a profile, to compare profiles, and to estimate the biological significance of such comparisons remains an active area of research.

Results: Here, we present a new homology detection method based on sequence profile-profile comparison. The method has a number of new features, including position-dependent gap penalties and a global score system. Position-dependent gap penalties provide a more biologically relevant way to represent and align protein families as sequence profiles. The global score system enables an analytical solution for the statistical parameters needed to estimate the statistical significance of profile-profile similarities. The new method, together with other state-of-the-art profile-based methods (HHsearch, COMPASS and PSI-BLAST), is benchmarked in an all-against-all comparison of a challenging set of SCOP domains that share at most 20% sequence identity. For benchmarking, we use a model-based evaluation framework that does not rely on a reference ("gold standard"). Evaluation results show that at the level of protein domains our method compares favorably to all other tested methods. We also provide examples of the new method outperforming structure-based similarity detection and alignment. The implementation of the new method, both as a standalone software package and as a web server, is available at http://www.ibt.lt/bioinformatics/coma.

Conclusion: Due to a number of developments, the new profile-profile comparison method shows an improved ability to match distantly related protein domains. Therefore, the method should be useful for annotation and homology modeling of uncharacterized proteins.

Highlights

  • Detection of common evolutionary origin is a primary means of inferring protein structure and function

  • Protein sequence comparison is the primary means for establishing homology

  • We show that at the protein domain level COMA performs better than several other state-of-the-art homology detection methods


Introduction

Detection of common evolutionary origin (homology) is a primary means of inferring protein structure and function. Comparing multiple sequence alignments instead of individual sequences can often facilitate inference of remote homology relationships. This is not surprising: in contrast to a single sequence, a set of aligned related sequences can tell much more about the conservation (functional or structural importance) of individual positions or regions within the polypeptide chain. Multiple sequence alignments are generally converted into one of two numerical forms: position-specific sequence profiles or Hidden Markov Models (HMMs) [1,2,3,4,5,6,7]. The main difference between the traditional sequence profile and the HMM is that the latter incorporates position-specific insertion and deletion probabilities
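To make the notion of a position-specific profile concrete, the following is a minimal sketch of how a multiple sequence alignment might be turned into per-column residue frequencies. It is an illustration only and not the procedure used by COMA or any of the cited methods: the function names `column_profile` and `column_entropy`, the uniform pseudocount, and the absence of sequence weighting are all assumptions made for brevity. The per-column gap fraction is included merely as a crude indication of where insertions and deletions occur; it is not the insertion and deletion probabilities of an HMM.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_profile(alignment, pseudocount=1.0):
    """Return, for each alignment column, residue frequencies and the gap fraction.

    `alignment` is a list of equal-length strings with '-' marking gaps.
    A uniform pseudocount is added to every residue count (an assumption of
    this sketch; real methods use more elaborate schemes).
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    profile = []
    for col in range(length):
        column = [row[col] for row in alignment]
        # Count only standard residues; gaps are tracked separately.
        counts = Counter(c for c in column if c in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        freqs = {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
        gap_fraction = column.count("-") / len(column)
        profile.append({"freqs": freqs, "gap_fraction": gap_fraction})
    return profile


def column_entropy(freqs):
    """Shannon entropy (in bits) of one column; lower means more conserved."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)


if __name__ == "__main__":
    # Toy alignment, invented for illustration.
    msa = ["MKV-LA", "MKI-LS", "MRVGLA", "MKV-LT"]
    for i, col in enumerate(column_profile(msa)):
        print(i, round(column_entropy(col["freqs"]), 3), col["gap_fraction"])
```

In this toy alignment, the fully conserved first column has lower entropy than the variable last column, and the fourth column is mostly gaps. This is the kind of per-position information that a single sequence cannot provide.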
