Abstract

Background

Structural alignment of proteins is one of the most challenging problems in molecular biology. The tertiary structure of a protein strictly correlates with its function, and computationally predicted structures are nowadays a main premise for understanding the latter. However, computationally derived 3D models often exhibit deviations from the native structure. One way to confirm a model is to compare it with other structures. The structural alignment of a pair of proteins can be defined using the concept of protein descriptors. Protein descriptors are local substructures of protein molecules, which allow us to divide the original problem into a set of subproblems and, consequently, to propose a more efficient algorithmic solution. In the literature, one can find many applications of the descriptor concept that prove its usefulness for insight into protein 3D structures, but the proposed approaches are presented from a biological perspective rather than from a computational or algorithmic point of view. Efficient algorithms for identification and structural comparison of descriptors can become crucial components of methods for structural quality assessment as well as tertiary structure prediction.

Results

In this paper, we propose a new combinatorial model and new polynomial-time algorithms for the structural alignment of descriptors. The model is based on the maximum-size assignment problem, which we define here and prove to be solvable in polynomial time. We demonstrate the suitability of this approach by comparison with an exact backtracking algorithm. Besides simplifying the problem through combinatorial modeling, at both the conceptual and the complexity level, this approach yields high-quality results in terms of 3D alignment accuracy and processing efficiency.

Conclusions

All the proposed algorithms were developed and integrated in a computationally efficient tool, descs-standalone, which allows the user to identify and structurally compare descriptors of biological molecules, such as proteins and RNAs. Both PDB (Protein Data Bank) and mmCIF (macromolecular Crystallographic Information File) formats are supported. The proposed tool is available as an open source project stored on GitHub (https://github.com/mantczak/descs-standalone).

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-016-1237-9) contains supplementary material, which is available to authorized users.
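The abstract frames descriptor alignment as an assignment problem that can be solved in polynomial time, but it does not state the formal definition of the maximum-size assignment problem. As a rough illustration only, the sketch below solves a related textbook problem, maximum-cardinality bipartite matching, with augmenting paths (Kuhn's algorithm). This is an assumption made for illustration, not the authors' model or the algorithm used in descs-standalone; the element names and the compatibility table are hypothetical.

```python
from typing import Dict, List, Set


def maximum_matching(compatible: Dict[str, List[str]]) -> Dict[str, str]:
    """Return a maximum-cardinality matching as {left element: right element}.

    `compatible` maps each element of the first descriptor to the elements
    of the second descriptor it may be paired with (hypothetical input).
    """
    match_of_right: Dict[str, str] = {}

    def try_assign(left: str, visited: Set[str]) -> bool:
        for right in compatible.get(left, []):
            if right in visited:
                continue
            visited.add(right)
            # Take `right` if it is free, or if its current partner
            # can be moved to another compatible element.
            if right not in match_of_right or try_assign(match_of_right[right], visited):
                match_of_right[right] = left
                return True
        return False

    # One augmenting-path search per left element: O(V * E) overall.
    for left in compatible:
        try_assign(left, set())

    return {left: right for right, left in match_of_right.items()}


# Hypothetical toy input: which elements could be paired at all.
compatible = {
    "a1": ["b1", "b2"],
    "a2": ["b1"],
    "a3": ["b2", "b3"],
}
matching = maximum_matching(compatible)
print(len(matching))
```

A greedy pass would pair a1 with b1 and leave a2 unmatched; the augmenting step reassigns a1 so that all three elements are paired. A backtracking search would find the same size by enumeration, which is the kind of exact baseline the abstract compares against.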

Highlights

  • Structural alignment of proteins is one of the most challenging problems in molecular biology

  • The quality assessment of biological molecules can be performed on the following levels: (1) a global perspective, where for every structural model a single score is computed representing the quality of the whole 3D model, and (2) a local perspective, where a structural reliability score is computed for a local neighborhood of each model residue

  • All the proposed algorithms were developed in Java and integrated in a computationally efficient tool, descs-standalone, which allows a user to identify and structurally compare descriptors of biological molecules, such as proteins and ribonucleic acid (RNA)


Summary

Introduction

Structural alignment of proteins is one of the most challenging problems in molecular biology. Efficient algorithms for identification and structural comparison of descriptors can become crucial components of methods for structural quality assessment as well as tertiary structure prediction. Sequencing the genomes of living organisms, that is, discovering their linear structure (the sequence of nucleotides), is nowadays a fundamental way of acquiring biological data. Such data are synthesized and analyzed using computer science tools and methods [1]. Computationally derived protein 3D models exhibit deviations from the corresponding reference structures. The structural quality of a predicted model relative to an experimentally derived reference structure can be assessed [6] or computed using a general-purpose method for structural comparison of proteins [7]. When the reference structure is not known, the assessment process is much more difficult.
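To make the idea of comparing a predicted model against a reference structure concrete, here is a minimal sketch of one widely used general-purpose measure, the root-mean-square deviation (RMSD), together with per-residue deviations. It assumes that the two structures are already superimposed and that residues correspond one-to-one by index; the coordinates are hypothetical, and this is not necessarily the measure used in the cited methods [6, 7].

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def per_residue_deviation(model: Sequence[Point], reference: Sequence[Point]) -> List[float]:
    """Local view: distance between each model atom and its reference atom."""
    if len(model) != len(reference) or not model:
        raise ValueError("structures must be non-empty and of equal length")
    return [math.dist(m, r) for m, r in zip(model, reference)]


def rmsd(model: Sequence[Point], reference: Sequence[Point]) -> float:
    """Global view: a single score for the whole model."""
    deviations = per_residue_deviation(model, reference)
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


# Hypothetical coordinates (for example, one representative atom per residue).
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 3.0, 4.0)]

local_scores = per_residue_deviation(model, reference)
global_score = rmsd(model, reference)
print(local_scores, round(global_score, 3))
```

The per-residue list shows where the model deviates, while the single RMSD value summarizes the whole structure. Both need a reference structure, which is why assessment becomes harder when none is available.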

