Abstract

Protein interaction networks provide an increasingly complex picture of the relationships between macromolecules in the cell. Complementing these interactions with structural data provides critical insights into interaction mechanisms. However, structural information is available for only a tiny fraction of the protein interactions and complexes currently known. To address this gap, we have developed a method to predict macromolecular complex structures by systematically combining pairwise interactions of known structure. We first identify all interactions within a network that are of known structure, or sufficiently similar to a known structure to permit homology modelling. We then use these structural constraints to construct models of complexes. We tackle the combinatorial explosion with an efficient algorithm that exploits heuristics to reduce the large search space, and complement this with an automated scoring system that filters out the exponentially large number of unrealistic complexes, leaving a ranked set of the most plausible models. To test the approach, we defined a benchmark set of complexes of known structure and show that many complexes can be re-created with good accuracy using templates below 75% sequence identity. Certain models are much larger and more complete than is possible with traditional modelling techniques. The approach can identify the most plausible homology models for a complex of dozens of proteins within a few hours. We applied the approach to whole-proteome sets of complexes from S. cerevisiae. For the complexes of known structure, we are able to identify the native complex in the majority of cases. We provide promising models for several dozen additional complexes, including multiple isoforms for each. Modelled complexes also provide functional classification, particularly for unannotated complexes from structural genomics initiatives.
We show that the best results are achieved when the stoichiometry of the components is known and when the modelling is approached hierarchically, with core components, representing high-confidence interactions, modelled before non-obligate interactions. We are refining this aspect of the automated modelling and making the procedure publicly available via a web service to aid in the analysis of models. As the rate of structurally resolved interactions grows, our ability to model larger and more diverse complexes will grow exponentially.
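
To illustrate the general idea of combining pairwise templates under a pruned search, the following is a minimal sketch and not the method's actual implementation. It assumes a beam search as the space-reducing heuristic, a summed template score as the ranking function, and a simple centre-to-centre distance test as the clash filter. The names `TEMPLATES`, `MIN_DISTANCE`, `placements`, `clashes` and `assemble` are hypothetical, and the template data are invented toy values; real templates would carry full rigid-body transforms rather than plain translations.

```python
from math import dist

# Hypothetical pairwise templates: (protein_a, protein_b) -> list of
# (offset of b relative to a, template score). Toy values for illustration only.
TEMPLATES = {
    ("A", "B"): [((10.0, 0.0, 0.0), 0.9), ((0.0, 10.0, 0.0), 0.4)],
    ("B", "C"): [((-10.0, 0.0, 0.0), 0.8), ((0.0, 10.0, 0.0), 0.7)],
    ("A", "C"): [((0.0, 10.0, 0.0), 0.6)],
}

MIN_DISTANCE = 5.0  # assumed clash threshold between protein centres


def placements(model, protein):
    """Yield (position, score) for `protein` from each template linking it to a placed partner."""
    for (a, b), options in TEMPLATES.items():
        for offset, score in options:
            if b == protein and a in model:
                # Apply the template offset forwards from the placed partner a.
                yield tuple(p + o for p, o in zip(model[a], offset)), score
            elif a == protein and b in model:
                # Apply the template offset backwards from the placed partner b.
                yield tuple(p - o for p, o in zip(model[b], offset)), score


def clashes(model, position):
    """Return True if `position` lies too close to any protein already in the model."""
    return any(dist(position, placed) < MIN_DISTANCE for placed in model.values())


def assemble(proteins, beam_width=10):
    """Grow models one protein at a time, keeping only the best `beam_width` partial models."""
    beam = [({proteins[0]: (0.0, 0.0, 0.0)}, 0.0)]
    for protein in proteins[1:]:
        candidates = []
        for model, total in beam:
            for position, score in placements(model, protein):
                if not clashes(model, position):
                    candidates.append(({**model, protein: position}, total + score))
        candidates.sort(key=lambda c: c[1], reverse=True)
        # If no template can place this protein, carry the partial models forward unchanged.
        beam = candidates[:beam_width] or beam
    return beam  # list of (model, score), best first


if __name__ == "__main__":
    for model, score in assemble(["A", "B", "C"]):
        print(round(score, 2), model)
```

Lowering `beam_width` trades completeness for speed: fewer partial models survive each step, so the search visits a smaller part of the combinatorial space but may discard a model that would have scored well once completed.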
