Abstract

Protein Structure Comparison (PSC) is a well-developed field of computational proteomics that attracts active interest from the research community because it is widely used in structural biology and drug discovery. With new PSC methods continuously emerging and no clear method of choice, Multi-Criteria Protein Structure Comparison (MCPSC) is commonly employed to combine methods and generate consensus structural similarity scores. We present pyMCPSC, a Python-based utility we developed to allow users to perform MCPSC efficiently by exploiting the parallelism afforded by the multi-core CPUs of today’s desktop computers. We show how pyMCPSC facilitates the analysis of similarities in protein domain datasets and how it can be extended to incorporate new PSC methods as they become available. We exemplify the power of pyMCPSC using a case study based on the Proteus_300 dataset. Results generated using pyMCPSC show that MCPSC scores form a reliable basis for identifying the true classification of a domain, as evidenced by both the ROC analysis and the Nearest-Neighbor analysis. Structure-similarity-based “Phylogenetic Tree” representations generated by pyMCPSC provide insight into functional grouping within the dataset of domains. Furthermore, scatter plots generated by pyMCPSC show strong correlation between protein domains belonging to SCOP Class C and loose correlation between those of SCOP Class D. Such analyses and the corresponding visualizations help users quickly gain insights about their datasets. The source code of pyMCPSC is available under the GPLv3.0 license through a GitHub repository (https://github.com/xulesc/pymcpsc).
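To make the idea of a consensus score concrete, the following minimal sketch combines per-method pairwise scores by min-max normalizing each method and averaging over the methods that scored a given pair. This is an illustrative assumption, not necessarily the scheme pyMCPSC implements: the function names, the method labels `method_a` and `method_b`, and the choice of a plain average are all hypothetical, and every method is assumed to report a dissimilarity (lower means more similar).

```python
def min_max_normalize(scores):
    """Rescale one method's scores to [0, 1] (hypothetical normalization choice)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All pairs scored identically: no information, map everything to 0.
        return {pair: 0.0 for pair in scores}
    return {pair: (s - lo) / (hi - lo) for pair, s in scores.items()}


def consensus_scores(per_method):
    """Average the normalized scores over the methods that scored each pair.

    per_method maps a method name to {(domain_a, domain_b): dissimilarity}.
    A pair missing from one method is averaged over the remaining methods.
    """
    normalized = {m: min_max_normalize(s) for m, s in per_method.items()}
    pairs = set().union(*(s.keys() for s in normalized.values()))
    consensus = {}
    for pair in pairs:
        values = [s[pair] for s in normalized.values() if pair in s]
        consensus[pair] = sum(values) / len(values)
    return consensus


# Toy input: two hypothetical methods with different score ranges.
per_method = {
    "method_a": {("d1", "d2"): 1.0, ("d1", "d3"): 3.0},
    "method_b": {("d1", "d2"): 10.0, ("d1", "d3"): 20.0, ("d2", "d3"): 15.0},
}
consensus = consensus_scores(per_method)
```

Normalizing before averaging keeps a method with a wide numeric range from dominating the consensus; other weighting schemes are equally plausible.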

Highlights

  • Protein Structure Comparison (PSC) allows the transfer of knowledge about known proteins to a novel protein

  • The number of pairwise PSC jobs processed per PSC method is one half of this value because of the symmetry of the PSC scores matrix; the post-processing and performance calculations are performed with the full matrix

  • The PDB files, the ground-truth SCOP classification and the pairwise domain list as well as the experimental setup are included in the test folder of the downloadable sources. pyMCPSC generates performance results for three sets of domain pairs, defined as follows:


Summary

Introduction

Protein Structure Comparison (PSC) allows the transfer of knowledge about known proteins to a novel protein. Novel protein structures are routinely compared against databases of known proteins to establish functional similarities using “guilt by association” [1]. PSC methods are also used to identify proteins with similar binding sites, all of which become potential targets for the same ligand [5, 6]. All these applications require the structure of one or more query proteins to be compared against a large number of known protein structures (one-to-all or many-to-many comparison) to identify protein pairs with high structural similarity.
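A many-to-many comparison of this kind can be sketched as follows. The sketch relies on the symmetry of the score matrix noted in the highlights: each unordered pair is scored once and the result is mirrored into the full matrix. All names here are hypothetical, `toy_score` is a stand-in for a real PSC method, and a zero diagonal assumes a dissimilarity score; none of this is taken from the pyMCPSC sources.

```python
from functools import partial
from itertools import combinations


def toy_score(seq_a, seq_b):
    """Stand-in for a real PSC method (hypothetical): length difference."""
    return abs(len(seq_a) - len(seq_b))


def _score_pair(pair, domains, score_fn):
    """Score one (name_a, name_b) job; top-level so it can be pickled."""
    a, b = pair
    return score_fn(domains[a], domains[b])


def pairwise_matrix(domains, score_fn, map_fn=map):
    """Score each unordered pair once, then mirror into a full matrix.

    map_fn defaults to the built-in map; a parallel map with the same
    calling convention can be passed in to spread jobs across cores.
    """
    names = sorted(domains)
    jobs = list(combinations(names, 2))  # n * (n - 1) / 2 jobs
    worker = partial(_score_pair, domains=domains, score_fn=score_fn)
    scores = map_fn(worker, jobs)
    # Assumption: a domain has dissimilarity 0 to itself.
    matrix = {a: {b: 0 for b in names} for a in names}
    for (a, b), score in zip(jobs, scores):
        matrix[a][b] = score
        matrix[b][a] = score  # symmetry: reuse the score for (b, a)
    return jobs, matrix


# Toy "domains": strings instead of real structures.
domains = {"d1": "AAAA", "d2": "AA", "d3": "AAAAAAA"}
jobs, matrix = pairwise_matrix(domains, toy_score)
```

To use several cores, the `map` method of a `multiprocessing.Pool` can be passed as `map_fn`, with the call placed under an `if __name__ == "__main__":` guard on platforms that spawn worker processes.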

