Abstract

Background: Protein structure analysis and comparison are major challenges in structural bioinformatics. Despite the existence of many tools and algorithms, very few of them have managed to capture the intuitive understanding of protein structures developed in structural biology, especially in the context of rapid database searches. Such intuitions could help speed up similarity searches and make the results of such analyses easier to understand.

Results: We developed the TOPS++FATCAT algorithm, which uses an intuitive description of protein structures, as captured in the popular TOPS diagrams, to limit the search space of the aligned fragment pairs (AFPs) in the flexible alignment of protein structures performed by the FATCAT algorithm. TOPS++FATCAT is more than an order of magnitude faster than FATCAT, at a minimal cost in classification and alignment accuracy. For beta-rich proteins its accuracy is better than that of FATCAT, because the TOPS+ string models contain important information about the parallel and anti-parallel hydrogen-bond patterns between the beta-strand secondary structure elements (SSEs). We show that the TOPS++FATCAT errors, rare as they are, can be clearly linked to oversimplifications in the TOPS diagrams and can be corrected by developing more precise secondary structure element definitions.

Software Availability: The benchmark analysis results and the compressed archive of the TOPS++FATCAT program for the Linux platform can be downloaded from the following web site:

Conclusion: TOPS++FATCAT provides FATCAT accuracy and insights into protein structural changes at a speed comparable to sequence alignments, opening up the possibility of interactive protein structure similarity searches.
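The search-space restriction described in the abstract can be illustrated with a minimal sketch. It is not the TOPS++FATCAT implementation: the AFP tuple layout, the SSE residue ranges, the midpoint rule, and the names `sse_index` and `filter_afps` are all assumptions made for illustration. The idea shown is only that an SSE-level alignment (such as one derived from comparing topology strings) can be used to discard candidate AFPs before the more expensive flexible alignment step.

```python
from typing import List, Set, Tuple

# Hypothetical representation: (start in protein A, start in protein B, length)
AFP = Tuple[int, int, int]
# Hypothetical representation: inclusive (first residue, last residue) of each SSE
SSERange = Tuple[int, int]


def sse_index(residue: int, sse_ranges: List[SSERange]) -> int:
    """Return the index of the SSE containing `residue`, or -1 if it lies in a loop."""
    for idx, (start, end) in enumerate(sse_ranges):
        if start <= residue <= end:
            return idx
    return -1


def filter_afps(
    afps: List[AFP],
    sse_a: List[SSERange],
    sse_b: List[SSERange],
    aligned_sse_pairs: Set[Tuple[int, int]],
) -> List[AFP]:
    """Keep only AFPs whose midpoints fall inside a pair of aligned SSEs.

    Assumption: an AFP is assigned to the SSE that contains its midpoint
    residue in each protein. AFPs centred on loops are dropped.
    """
    kept = []
    for start_a, start_b, length in afps:
        mid_a = start_a + length // 2
        mid_b = start_b + length // 2
        pair = (sse_index(mid_a, sse_a), sse_index(mid_b, sse_b))
        if pair in aligned_sse_pairs:
            kept.append((start_a, start_b, length))
    return kept


# Toy example: two SSEs per protein, aligned first-to-first and second-to-second.
sse_a = [(0, 9), (15, 24)]
sse_b = [(0, 9), (20, 29)]
aligned = {(0, 0), (1, 1)}
candidates = [(0, 0, 8), (0, 20, 8), (16, 21, 8), (11, 11, 2)]
print(filter_afps(candidates, sse_a, sse_b, aligned))
```

In this toy input, the AFP that pairs the first SSE of one protein with the second SSE of the other, and the AFP centred on loop residues, are both discarded, so fewer fragment pairs would reach the alignment stage.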

Highlights

  • Protein structure analysis and comparison are major challenges in structural bioinformatics

  • We explore the question of whether it would be possible to combine insights provided by topology diagrams into automated protein structure alignment algorithms, focusing on the FATCAT program developed previously in our group

  • Receiver Operating Characteristic (ROC) and AUC analyses: we compared the performance of the TOPS++FATCAT method against the original FATCAT method using SCOP classification information at the superfamily level
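As a rough illustration of the ROC/AUC comparison mentioned in the last highlight, the sketch below computes the area under the ROC curve from pairwise similarity scores and binary labels. The scoring convention (higher score means more similar), the labels (1 when two domains share a SCOP superfamily, 0 otherwise), and the function name `roc_auc` are assumptions for illustration; the scores actually used in the benchmark are not given in this section.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    AUC is the fraction of (positive, negative) pairs in which the positive
    example receives the higher score; ties count as half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up scores for four structure pairs; label 1 = same superfamily (assumed).
example_scores = [0.9, 0.8, 0.4, 0.3]
example_labels = [1, 0, 1, 0]
print(roc_auc(example_scores, example_labels))  # 0.75
```

Running the same function on the scores produced by two methods over the same labelled pairs gives one AUC per method, which is the kind of side-by-side comparison the highlight describes. The quadratic pair loop is fine for a small illustration; a rank-based computation would be preferable for large benchmarks.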


Introduction

Protein structure analysis and comparison are major challenges in structural bioinformatics. Two synergistic strategies are typically used for this purpose. In the first, used by classification systems such as SCOP [1] or CATH [2], human intuition simplifies the description of protein structures to a manageable size, and a human eye, sometimes supported by automated analysis, recognizes patterns and types of structures. In the second approach, specialized comparison algorithms, such as DALI [3], CE [4], or FATCAT [5], are used to calculate a distance-like metric in the protein structure space, which in turn can be used to cluster proteins into groups. Many such algorithms have been developed over the past few decades and have been mostly used for the classification of protein structures into families
