Abstract

Background: The RCSB Protein Data Bank (PDB) provides public access to experimentally determined 3D-structures of biological macromolecules (proteins, peptides and nucleic acids). While various tools are available to explore the PDB, options to access the global structural diversity of the entire PDB and to perceive relationships between PDB structures remain very limited.

Methods: A 136-dimensional atom pair 3D-fingerprint for proteins (3DP), counting categorized atom pairs at increasing through-space distances, was designed to represent the molecular shape of PDB-entries. Nearest neighbor search examples are reported, exemplifying the ability of 3DP-similarity to identify closely related biomolecules, from small peptides to enzymes and large multiprotein complexes such as virus particles. Principal component analysis was used to visualize the PDB in 3DP-space.

Results: The 3DP property space groups proteins and protein assemblies according to their 3D-shape similarity, yet shows exquisite ability to distinguish between closely related structures. An interactive website called PDB-Explorer is presented, featuring a color-coded interactive map of the PDB in 3DP-space. Each pixel of the map contains one or more PDB-entries, which are directly visualized as ribbon diagrams when the pixel is selected. The PDB-Explorer website allows 3DP-nearest neighbor searches of any PDB-entry or of any structure uploaded as a protein-type PDB file. All functionalities on the website are implemented in JavaScript in a platform-independent manner and draw data from a server that is updated daily with the latest PDB additions, ensuring complete and up-to-date coverage. The essentially instantaneous 3DP-similarity search with the PDB-Explorer provides results comparable to those of much slower 3D-alignment algorithms, and automatically clusters proteins from the same superfamilies in tight groups.

Conclusion: A chemical space classification of the PDB based on molecular shape was obtained using a new atom-pair 3D-fingerprint for proteins and implemented in a web-based database exploration tool comprising an interactive color-coded map of the PDB chemical space and a nearest neighbor search tool. The PDB-Explorer website is freely available at www.cheminfo.org/pdbexplorer and represents an unprecedented opportunity to interactively visualize and explore the structural diversity of the PDB.

Graphical abstract: Maps of PDB in 3DP-space color-coded by heavy atom count and shape.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0776-9) contains supplementary material, which is available to authorized users.
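The Methods part of the abstract describes 3DP as a count of categorized atom pairs at increasing through-space distances. The Python sketch below illustrates that general idea only. The abstract does not list the atom categories or the distance bins, so the four category names, the 34 bins of 1 Å width (4 × 34 = 136) and the hard binning are assumptions chosen to reach the stated dimensionality, not the published 3DP definition. The function name and the toy atoms are likewise hypothetical.

```python
import math
from itertools import combinations

# ASSUMPTION: categories, bin count and bin width are illustrative only;
# they are not taken from the published 3DP definition.
CATEGORIES = ("heavy", "hydrophobic", "positive", "negative")
N_BINS = 34        # assumed number of distance bins per category
BIN_WIDTH = 1.0    # assumed bin width in angstroms


def atom_pair_fingerprint(atoms):
    """Count categorized atom pairs per through-space distance bin.

    atoms: list of (x, y, z, categories), where categories is a set of
    category names the atom belongs to.
    Returns a list of len(CATEGORIES) * N_BINS integer counts.
    """
    fp = [0] * (len(CATEGORIES) * N_BINS)
    for (x1, y1, z1, c1), (x2, y2, z2, c2) in combinations(atoms, 2):
        distance = math.dist((x1, y1, z1), (x2, y2, z2))
        # Pairs farther apart than the last bin edge fall into the last bin.
        bin_index = min(int(distance / BIN_WIDTH), N_BINS - 1)
        for k, category in enumerate(CATEGORIES):
            # A pair is counted for a category only if both atoms belong to it.
            if category in c1 and category in c2:
                fp[k * N_BINS + bin_index] += 1
    return fp


# Hypothetical three-atom toy structure.
toy_atoms = [
    (0.0, 0.0, 0.0, {"heavy", "hydrophobic"}),
    (1.5, 0.0, 0.0, {"heavy", "hydrophobic"}),
    (0.0, 4.2, 0.0, {"heavy", "positive"}),
]
toy_fp = atom_pair_fingerprint(toy_atoms)
print(len(toy_fp), sum(toy_fp))
```

The loop visits every atom pair, so this sketch scales quadratically with atom count. It is meant to show the data layout of such a fingerprint, not a practical way to process large assemblies.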

Highlights

  • The RCSB Protein Data Bank (PDB) provides public access to experimentally determined 3D-structures of biological macromolecules

  • Similarity values $S_{3DP}$ were calculated from the city-block distance $CBD_{3DP}$ as shown in Eq 4, with X the median of the city-block distance (CBD) distribution computed for 1 M random pairs of molecules within the database: $S_{3DP} = X / (CBD_{3DP} + X)$

  • Inspired by the concept of atom-pair fingerprints proposed by Carhart [30], Sheridan [31] and Schneider [32] to encode pharmacophores in small organic molecules, we recently reported a detailed analysis establishing the suitability of atom-pair fingerprints for 3D-shape and pharmacophore similarity searches in very large databases such as ChEMBL [33] and ZINC [34], using both topological distances read from the 2D-structures [35] and through-space distances read from the 3D-structures [20]
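The similarity highlight above turns a city-block distance into a score as X divided by (CBD + X), where X is a median city-block distance. The Python sketch below shows one way to compute that score and use it to rank fingerprints against a query. The function names and the three-dimensional toy fingerprints are hypothetical, and X is estimated here from all pairs of the toy set, whereas the source uses 1 M random pairs from the database.

```python
from itertools import combinations
from statistics import median


def city_block_distance(fp_a, fp_b):
    """Sum of absolute element-wise differences between two fingerprints."""
    return sum(abs(a - b) for a, b in zip(fp_a, fp_b))


def similarity(fp_a, fp_b, x):
    """Similarity X / (CBD + X); equals 1 for identical fingerprints."""
    return x / (city_block_distance(fp_a, fp_b) + x)


def nearest_neighbors(query_fp, database, x, k=3):
    """Return the k (score, name) pairs most similar to query_fp."""
    scored = [(similarity(query_fp, fp, x), name) for name, fp in database.items()]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


# Hypothetical toy database; real 3DP fingerprints have 136 dimensions.
toy_db = {"A": [4, 0, 2], "B": [4, 1, 2], "C": [0, 6, 9]}

# ASSUMPTION: median over all toy pairs stands in for the 1 M random pairs.
x_median = median(
    city_block_distance(a, b) for a, b in combinations(toy_db.values(), 2)
)

hits = nearest_neighbors(toy_db["A"], toy_db, x_median, k=2)
print(hits)
```

With this definition, a pair whose distance equals the median X scores exactly 0.5, so scores above 0.5 mark pairs that are closer than a typical random pair.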


Summary

Introduction

The RCSB Protein Data Bank (PDB) provides public access to experimentally determined 3D-structures of biological macromolecules (proteins, peptides and nucleic acids). We report a new exploration tool for the PDB called PDB-Explorer, which addresses the need for a global perception of the database by giving direct access to all PDB-entries via an interactive color-coded map representing its entire contents in molecular shape space. This application follows the principle of our recently reported MQN-mapplet and SMIfp-mapplet applications designed to visualize the chemical space of small organic molecules [15,16,17,18]. The PDB-Explorer provides an unprecedented global view of the PDB, allowing a detailed exploration of its entire content in a curiosity-driven manner, with or without specific queries. This tool is freely available at www.cheminfo.org/pdbexplorer and should greatly facilitate the perception and understanding of the overall diversity of proteins and biological assemblies available in the PDB.

