Abstract

Predicting the structure of interacting protein chains is a fundamental step towards understanding protein function. Unfortunately, no computational method can produce accurate structures of protein complexes. AlphaFold2 has shown unprecedented levels of accuracy in modelling single-chain protein structures. Here, we apply AlphaFold2 to the prediction of heterodimeric protein complexes. We find that the AlphaFold2 protocol, together with optimised multiple sequence alignments, generates models of acceptable quality (DockQ ≥ 0.23) for 63% of the dimers. From the predicted interfaces, we create a simple function to predict the DockQ score, which distinguishes acceptable from incorrect models, as well as interacting from non-interacting proteins, with state-of-the-art accuracy. Using the predicted DockQ scores, we can identify 51% of all interacting pairs at a 1% false positive rate (FPR).

Highlights

  • Predicting the structure of interacting protein chains is a fundamental step towards understanding protein function

  • Protein docking methodologies model how proteins interact and, when proteins are treated as rigid bodies, can be divided into two categories: those based on an exhaustive search of the docking space[6] and those based on alignments to structural templates[7]

  • We found that generating the optimal multiple sequence alignment (MSA) is crucial for obtaining accurate Fold and Dock solutions, but this is not always trivial because the exact set of interacting protein pairs must be identified[26]


Summary

Results and discussion

The success rate (SR), i.e., the percentage of acceptable models (DockQ > 0.23), is used to measure AF2 performance over the development set (216 proteins) using the different MSAs. The best performance is 33.3% for the AF2 MSAs and 39.4% for the AF2 + paired MSAs (Table 1).

Table 1: Results of AF2 run on the development set (n = 216) using different MSAs and neural network configurations.

The best outcome using this modelling strategy is an SR of 57.8% (856 out of 1481 correctly modelled complexes) for the AF2 + paired MSAs, compared with 45.0% using the AF2 MSAs alone (Fig. 1, Table 2). The recently developed AF-multimer[28] has the best performance (SR = 72.2%, median = 0.560, Table 2). However, this method was trained on the same data as the test set, which makes a direct comparison difficult. Different criteria were examined over the test set, including (i) the number of unique interacting residues (Cβ atoms from different chains within 8 Å from each other) in the interface, (ii) the total number of interactions between Cβ atoms in the interface, (iii) the average
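
The interface criteria (i) and (ii) and the SR can be illustrated with a minimal sketch. It assumes the Cβ coordinates of each chain have already been extracted as (x, y, z) tuples in Å. The function names `interface_contacts` and `success_rate` are hypothetical and are not taken from the authors' code. Counting interface residues on both chains and treating both cutoffs as inclusive (distance ≤ 8 Å, DockQ ≥ 0.23) are also assumptions.

```python
from math import dist  # Euclidean distance, Python 3.8+


def interface_contacts(cb_a, cb_b, cutoff=8.0):
    """Count interface residues and Cβ-Cβ contacts between two chains.

    cb_a, cb_b: sequences of (x, y, z) Cβ coordinates in Å, one per residue.
    Returns (n_unique_interface_residues, n_contacts).
    Assumption: a contact is any inter-chain Cβ pair with distance <= cutoff.
    """
    contacts = [
        (i, j)
        for i, a in enumerate(cb_a)
        for j, b in enumerate(cb_b)
        if dist(a, b) <= cutoff
    ]
    residues_a = {i for i, _ in contacts}  # interface residues on chain A
    residues_b = {j for _, j in contacts}  # interface residues on chain B
    return len(residues_a) + len(residues_b), len(contacts)


def success_rate(dockq_scores, threshold=0.23):
    """Percentage of models whose DockQ score reaches the threshold."""
    if not dockq_scores:
        raise ValueError("dockq_scores must not be empty")
    n_acceptable = sum(score >= threshold for score in dockq_scores)
    return 100.0 * n_acceptable / len(dockq_scores)


# Toy coordinates, not real structural data.
chain_a = [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
chain_b = [(5.0, 0.0, 0.0), (0.0, 6.0, 0.0), (100.0, 0.0, 0.0)]
n_residues, n_contacts = interface_contacts(chain_a, chain_b)
```

The all-pairs loop is quadratic in chain length, which is adequate for single dimers; a spatial index would be a reasonable substitute for large-scale screening.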
