In this study, we investigate what types of interactions are specific to their biological function, and what types of interactions are persistent regardless of their functional category in transient protein-protein heterocomplexes. This is the first approach to analyze protein-protein interfaces systematically at the molecular interaction level in the context of protein functions. We perform systematic analysis at the molecular interaction level using classification and feature subset selection technique prevalent in the field of pattern recognition. To represent the physicochemical properties of protein-protein interfaces, we design 18 molecular interaction types using canonical and noncanonical interactions. Then, we construct input vector using the frequency of each interaction type in protein-protein interface. We analyze the 131 interfaces of transient protein-protein heterocomplexes in PDB: 33 protease-inhibitors, 52 antibody-antigens, 46 signaling proteins including 4 cyclin dependent kinase and 26 G-protein. Using kNN classification and feature subset selection technique, we show that there are specific interaction types based on their functional category, and such interaction types are conserved through the common binding mechanism, rather than through the sequence or structure conservation. The extracted interaction types are C(alpha)-- H...O==C interaction, cation...anion interaction, amine...amine interaction, and amine...cation interaction. With these four interaction types, we achieve the classification success rate up to 83.2% with leave-one-out cross-validation at k = 15. Of these four interaction types, C(alpha)--H...O==C shows binding specificity for protease-inhibitor complexes, while cation-anion interaction is predominant in signaling complexes. The amine ... amine and amine...cation interaction give a minor contribution to the classification accuracy. When combined with these two interactions, they increase the accuracy by 3.8%. In the case of antibody-antigen complexes, the sign is somewhat ambiguous. From the evolutionary perspective, while protease-inhibitors and sig-naling proteins have optimized their interfaces to suit their biological functions, antibody-antigen interactions are the happenstance, implying that antibody-antigen complexes do not show distinctive interaction types. Persistent interaction types such as pi...pi, amide-carbonyl, and hydroxyl-carbonyl interaction, are also investigated. Analyzing the structural orientations of the pi...pi stacking interactions, we find that herringbone shape is a major configuration in transient protein-protein interfaces. This result is different from that of protein core, where parallel-displaced configurations are the major configuration. We also analyze overall trend of amide-carbonyl and hydroxyl-carbonyl interactions. It is noticeable that nearly 82% of the interfaces have at least one hydroxyl-carbonyl interactions.
Read full abstract