Abstract

A multi-interface domain is a domain that can form multiple distinct binding sites to contact many other domains, making it a hub in domain-domain interaction networks. The functions performed by the multiple interfaces usually differ, but there is no strict bijection between functions and interfaces, because some subsets of the interfaces perform the same function. This work applies graph theory and algorithms to discover fingerprints for the multiple interfaces of a domain and to establish associations between the interfaces and functions, based on a large set of multi-interface proteins from PDB. We found that about 40% of proteins have the multi-interface property; however, the multi-interface domains involved account for only a tiny fraction (1.8%) of the total number of domains. The interfaces of these domains are distinguishable by their fingerprints, indicating the functional specificity of the multiple interfaces in a domain. Furthermore, we observed that both cooperative and distinctive structural patterns, which will be useful for protein engineering, exist in the multiple interfaces of a domain.

Highlights

  • A protein domain is usually a contiguous segment in a protein’s primary sequence that can be independently folded to form a stable tertiary structure

  • This domain has thirteen different interfaces that perform five molecular functions according to Gene Ontology (GO) annotations [5]

  • We address the following questions: (i) What kind of domains prefer the multi-interface property? That is, we want to know the distribution of domains that have multiple interfaces. (ii) What are the fingerprints of an interface, or a subset of interfaces, in a domain? That is, we want to discover unique structures in a domain that distinguish the multiple interfaces from each other. (iii) What are the relationships between the multiple interfaces in a domain? That is, we want to see whether the multiple interfaces in


Summary

Introduction

A protein domain is usually a contiguous segment in a protein’s primary sequence that can fold independently into a stable tertiary structure.

Multi-interface proteins are further aggregated into several groups in accordance with SCOP classification, using the following steps: (i) Align each multi-interface protein sequence to its target domain sequence. (ii) Adjust the position labels of the interface residues for all the interfaces in each cluster according to the multiple sequence alignment. This adjustment is necessary because the same chain can be numbered differently in different entries.

Closed frequent subgraphs of one interface cluster, which are taken as the fingerprints of that interface, are mined from an interface graph database, where the graph database is the set of interface residue contact graphs in one domain.

Paired cooperative graph sets between two different interfaces in a domain are identified using the following steps: (i) Build a graph database for each set of interfaces.
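The position-label adjustment in step (ii) can be illustrated with a minimal Python sketch. It assumes that each chain is available as a gapped row of a multiple sequence alignment and that interface residues are given by their entry-specific residue numbers. The function names `alignment_column_map` and `relabel_interface`, the gap symbol, and the toy sequences are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import Dict, Iterable, List

GAP = "-"  # assumed gap symbol in the aligned sequences


def alignment_column_map(aligned_seq: str, first_residue_number: int = 1) -> Dict[int, int]:
    """Map each residue number of a chain to its 0-based alignment column.

    Assumes residues are numbered consecutively starting at
    first_residue_number; gap positions consume a column but no number.
    """
    mapping: Dict[int, int] = {}
    residue_number = first_residue_number
    for column, symbol in enumerate(aligned_seq):
        if symbol == GAP:
            continue
        mapping[residue_number] = column
        residue_number += 1
    return mapping


def relabel_interface(
    interface_residues: Iterable[int],
    aligned_seq: str,
    first_residue_number: int = 1,
) -> List[int]:
    """Replace entry-specific residue numbers with shared alignment columns."""
    column_of = alignment_column_map(aligned_seq, first_residue_number)
    # Residues outside the aligned region are silently skipped.
    return sorted(column_of[r] for r in interface_residues if r in column_of)


# Two toy entries of the same chain, numbered differently (1.. versus 101..).
entry_a = relabel_interface([2, 3], "MK-TAYI", first_residue_number=1)
entry_b = relabel_interface([101, 103], "-KLTAYI", first_residue_number=101)
```

In this toy case both entries end up with the same alignment-column labels, so their interface residues become directly comparable even though the original numbering schemes differ. Real PDB numbering can contain insertion codes and missing residues, which this sketch does not handle.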

