Abstract

With the accumulation of a large number and variety of molecules in the Protein Data Bank (PDB) comes the need on occasion to review and improve their representation. The Worldwide PDB (wwPDB) partners have periodically updated various aspects of structural data representation to improve the integrity and consistency of the archive. The remediation effort described here was focused on improving the representation of peptide-like inhibitor and antibiotic molecules so that they can be easily identified and analyzed. Peptide-like inhibitors or antibiotics were identified in over 1000 PDB entries, systematically reviewed and represented either as peptides with polymer sequence or as single components. For the majority of the single-component molecules, their peptide-like composition was captured in a new representation, called the subcomponent sequence. A novel concept called “group” was developed for representing complex peptide-like antibiotics and inhibitors that are composed of multiple polymer and nonpolymer components. In addition, a reference dictionary was developed with detailed information about these peptide-like molecules to aid in their annotation, identification and analysis. Based on the experience gained in this remediation, guidelines, procedures, and tools were developed to annotate new depositions containing peptide-like inhibitors and antibiotics accurately and consistently. © 2013 Wiley Periodicals, Inc. Biopolymers 101: 659–668, 2014.

Highlights

  • The Protein Data Bank (PDB) is the single global archive of three-dimensional (3D) structural data of biological macromolecules and their complexes

  • The first step in remediation was the identification of the peptide-like inhibitor and antibiotic molecules in the PDB archive

  • Over a thousand PDB entries were found to contain peptide-like inhibitors and antibiotics (150 PDB entries with 60 different peptide-like antibiotics and 850 PDB entries with 310 peptide-like inhibitors)

Introduction

The Protein Data Bank (PDB) is the single global archive of three-dimensional (3D) structural data of biological macromolecules and their complexes. It is managed by the Worldwide PDB (wwPDB; http://wwpdb.org),[1] a collaborative organization with four partners: the Research Collaboratory for Structural Bioinformatics (RCSB PDB; http://rcsb.org), the PDB in Europe (PDBe; http://pdbe.org), the PDB Japan (PDBj; http://pdbj.org), and the Biological Magnetic Resonance Data Bank (BMRB; http://bmrb.wisc.edu). The partners act as deposition, processing, and distribution centers for PDB data. They collaborate on developing annotation procedures and guidelines as well as data representation models and formats, and they work with community experts to define data quality and validation standards.[2] Occasionally, the wwPDB undertakes large-scale remediation efforts to improve the data representation, consistency, integrity, and usability of the archive. Past archive-wide remediation projects[3,4] have focused on (i) improving the chemical description of the monomer units of the biological polymers and small molecule ligands in the PDB, (ii) standardizing the atom nomenclature to conform to IUPAC recommendations, (iii) updating sequence and taxonomy database references, (iv) improving the representation of viruses, and (v) verifying primary citation assignments.
