Abstract

Most of the proteins in a cell assemble into complexes to carry out their function. It is therefore crucial to understand the physicochemical properties as well as the evolution of interactions between proteins. The Protein Data Bank represents an important source of information for such studies, because more than half of the structures are homo- or heteromeric protein complexes. Here we propose the first hierarchical classification of whole protein complexes of known 3-D structure, based on representing their fundamental structural features as a graph. This classification provides the first overview of all the complexes in the Protein Data Bank and allows nonredundant sets to be derived at different levels of detail. This reveals that between one-half and two-thirds of known structures are multimeric, depending on the level of redundancy accepted. We also analyse the structures in terms of the topological arrangement of their subunits and find that they form a small number of arrangements compared with all theoretically possible ones. This is because most complexes contain four subunits or fewer, and the large majority are homomeric. In addition, there is a strong tendency for symmetry in complexes, even for heteromeric complexes. Finally, through comparison of Biological Units in the Protein Data Bank with the Protein Quaternary Structure database, we identified many possible errors in quaternary structure assignments. Our classification, available as a database and Web server at http://www.3Dcomplex.org, will be a starting point for future work aimed at understanding the structure and evolution of protein complexes.

Highlights

  • Most proteins interact with other proteins and form protein complexes to carry out their function [1]

  • The first definition, which is the most lenient, is based solely on the topology of the graphs. This means that any two complexes with the same number of chains and the same pattern of contacts belong to the same group, even if their chains are structurally unrelated. We use this definition to create the groups that form the top level of the classification. We call these groups Quaternary Structure Topologies (QS Topologies, or QSTs) and find 192 of them in the current dataset

  • We have presented a novel method to describe and compare structures of protein complexes, which we used to derive a hierarchical classification system. This hierarchical classification allows us to answer the question, “How many different complexes exist in the Protein Data Bank (PDB)?” Depending on the level of detail, we find between 192 structures at the top level and 12,231 structures at the bottom level of the hierarchy
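
The topology-only grouping described in the highlights amounts to asking whether two chain-contact graphs are isomorphic. The sketch below illustrates that idea under stated assumptions; it is not the authors' implementation. Chains are assumed to be numbered 0..n-1, contacts are assumed to be given as unordered pairs of chain indices with no self-contacts, and the function name `same_topology` is hypothetical. The brute-force search over relabelings is only practical for small complexes.

```python
from itertools import permutations


def normalize(contacts):
    """Store each contact as a sorted pair so (1, 0) and (0, 1) compare equal."""
    return {tuple(sorted(pair)) for pair in contacts}


def same_topology(n_a, contacts_a, n_b, contacts_b):
    """Return True if two chain-contact graphs have the same topology.

    n_a, n_b: number of chains in each complex (hypothetical inputs).
    contacts_a, contacts_b: iterables of (i, j) chain-index pairs in contact.
    Chain structure and sequence are deliberately ignored.
    """
    if n_a != n_b:
        return False
    edges_a, edges_b = normalize(contacts_a), normalize(contacts_b)
    if len(edges_a) != len(edges_b):
        return False
    # Try every relabeling of the chains of complex A (n! candidates).
    for perm in permutations(range(n_a)):
        mapped = {tuple(sorted((perm[i], perm[j]))) for i, j in edges_a}
        if mapped == edges_b:
            return True
    return False


# Two four-chain rings with different chain numbering, a linear chain, and a star.
ring_a = [(0, 1), (1, 2), (2, 3), (3, 0)]
ring_b = [(0, 2), (2, 1), (1, 3), (3, 0)]
path = [(0, 1), (1, 2), (2, 3)]
star = [(0, 1), (0, 2), (0, 3)]

print(same_topology(4, ring_a, 4, ring_b))  # rings match regardless of labels
print(same_topology(4, path, 4, star))      # same edge count, different pattern
```

Because only the number of chains and the contact pattern are compared, two rings built from unrelated proteins fall into the same group, which is what makes this the most lenient level of the hierarchy.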


Summary

Introduction

Most proteins interact with other proteins and form protein complexes to carry out their function [1]. A recent survey of ~2,000 yeast proteins found that more than 80% of the proteins interact with at least one partner [2]. This reflects the importance of protein interactions within a cell. The Protein Data Bank (PDB) [3] makes available a large number of structures that effectively provide a molecular snapshot of proteins and their interactions, at a much greater level of detail than other experimental methods. Since half of the crystallographic structures are homo- or heteromeric protein complexes, crystallographic data represent an important source of information to study the molecular bases of protein–protein interactions, and more generally of protein complex formation.
