Abstract

Background: In structural bioinformatics, there is an increasing interest in identifying and understanding the evolution of local protein structures regarded as key structural or functional protein building blocks. A central need is then to compare these, possibly short, fragments by measuring their (dis)similarity efficiently and accurately. Progress towards this goal has produced scores that can assess the strong similarity of fragments. Yet there is still a lack of more progressive scores, with meaningful intermediate values, for the comparison, retrieval or clustering of distantly related fragments.

Results: We introduce here the Amplitude Spectrum Distance (ASD), a novel way of comparing protein fragments based on the discrete Fourier transform of their Cα distance matrix. Defined as the distance between their amplitude spectra, ASD can be computed efficiently and provides a parameter-free measure of the global shape dissimilarity of two fragments. ASD has useful theoretical properties: it is tolerant to shifts, insertions, deletions, circular permutations and sequence reversals, while satisfying the triangle inequality. The practical interest of ASD with respect to the RMSD, RMSDd, BC and TM scores is illustrated through zinc finger retrieval experiments and concrete structure examples. The benefits of ASD are also illustrated by two additional clustering experiments, on domain linker fragments and on complementarity-determining regions of antibodies.

Conclusions: Taking advantage of the Fourier transform to compare fragments at a global shape level, ASD is an objective and progressive measure that takes the whole fragments into account. Its practical computation time and its properties make ASD particularly relevant for applications requiring meaningful measures on distantly related protein fragments, such as the retrieval of similar fragments where high recall is required, as shown in the experiments, or for any application that also benefits from the triangle inequality, such as fragment clustering.

The ASD program and source code are freely available at: http://www.irisa.fr/dyliss/public/ASD/.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0693-y) contains supplementary material, which is available to authorized users.
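The idea of comparing amplitude spectra of Cα distance matrices can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: it assumes two fragments of equal length, uses a plain 2D discrete Fourier transform, and takes an unnormalized Euclidean distance between the amplitude spectra. The exact normalization and the handling of fragments of different lengths are not given in this text. All function names (`distance_matrix`, `amplitude_spectrum`, `spectrum_distance`) are hypothetical.

```python
import cmath
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between C-alpha coordinates."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def dft(seq):
    """Naive 1D discrete Fourier transform (fine for short fragments)."""
    n = len(seq)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(seq))
        for k in range(n)
    ]


def amplitude_spectrum(matrix):
    """Amplitudes of the 2D DFT, computed as row DFTs followed by column DFTs.

    The result is stored transposed; this is harmless here because both
    fragments are processed identically before being compared.
    """
    rows = [dft(row) for row in matrix]
    cols = [dft(col) for col in zip(*rows)]
    return [[abs(value) for value in col] for col in cols]


def spectrum_distance(coords_a, coords_b):
    """Assumed form: Euclidean distance between the two amplitude spectra."""
    if len(coords_a) != len(coords_b):
        raise ValueError("this sketch only handles fragments of equal length")
    spec_a = amplitude_spectrum(distance_matrix(coords_a))
    spec_b = amplitude_spectrum(distance_matrix(coords_b))
    return math.sqrt(
        sum(
            (x - y) ** 2
            for row_a, row_b in zip(spec_a, spec_b)
            for x, y in zip(row_a, row_b)
        )
    )
```

Because a circular shift or a reversal of a real signal changes only the phase of its Fourier transform, this sketch returns a value of (numerically) zero when one fragment is a circular permutation or a sequence reversal of the other, in line with the tolerance properties stated above.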

Highlights

  • In structural bioinformatics, there is an increasing interest in identifying and understanding the evolution of local protein structures regarded as key structural or functional protein building blocks

  • Evaluation of the structural similarity of two proteins is an important task in bioinformatics that is mainly performed at three levels: global protein comparison, structural motif comparison and fragment comparison

  • We propose here a novel dissimilarity, named Amplitude Spectrum Distance (ASD), that overcomes these issues by using the Fourier transform to compare the fragments at a global shape level without explicit structure superimposition


Summary

Introduction

There is an increasing interest in identifying and understanding the evolution of local protein structures regarded as key structural or functional protein building blocks. A central need is to compare these, possibly short, fragments by measuring their (dis)similarity efficiently and accurately. Progress towards this goal has produced scores that can assess the strong similarity of fragments. The classical score used to measure the dissimilarity of two protein structures is the coordinate root-mean-square deviation (RMSD), defined as the minimum average distance between superimposed atoms (usually the Cα) of the proteins under optimal rigid-body rotation and translation. The drawbacks of RMSD are well known: it requires computing the optimal superimposition of the atoms, it tends to increase with protein length, and it is more sensitive to local than to global structural deviations. The BC score, RMSD and RMSDd can all be computed by tractable exact algorithms. They do not rely on expert-chosen parameters, so they apply universally to protein fragments. The limitation of these scores is that they measure the distance between two ordered sets of residues already aligned one-to-one (the ith residue of the first set is aligned with the ith residue of the second set, typically in the same order as in the fragments' sequences), which makes them less suited to the comparison of homologous fragments with mismatches resulting, for instance, from insertions or deletions.
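To make the RMSD definition concrete, here is a minimal sketch of a superposition-based RMSD. It assumes NumPy is available and uses the Kabsch algorithm (an SVD-based method for the optimal rotation), which is a standard choice but is not named in this text. The function name `kabsch_rmsd` is hypothetical. Note that the function requires the two coordinate sets to have the same length and to be paired row by row, which is exactly the one-to-one alignment constraint discussed above.

```python
import numpy as np


def kabsch_rmsd(coords_p, coords_q):
    """RMSD after optimal rigid-body superposition of P onto Q.

    Row i of P is assumed to correspond to row i of Q.
    """
    p = np.asarray(coords_p, dtype=float)
    q = np.asarray(coords_q, dtype=float)
    if p.shape != q.shape or p.ndim != 2 or p.shape[1] != 3:
        raise ValueError("expected two (n, 3) coordinate arrays of equal length")

    # Remove the translation by centering both sets on their centroids.
    p_centered = p - p.mean(axis=0)
    q_centered = q - q.mean(axis=0)

    # Optimal rotation from the SVD of the 3x3 covariance matrix.
    covariance = p_centered.T @ q_centered
    u, _, vt = np.linalg.svd(covariance)

    # Flip one axis if needed so the result is a proper rotation (no reflection).
    handedness = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, handedness]) @ u.T

    # Root-mean-square deviation of the paired atoms after superposition.
    diff = p_centered @ rotation.T - q_centered
    return float(np.sqrt((diff ** 2).sum() / len(p)))
```

Calling `kabsch_rmsd` on a fragment and a rotated, translated copy of itself should give a value close to zero, while any insertion or deletion first requires deciding which residues to pair.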

Methods
Results
Conclusion
