Computational protein design efforts continue to make remarkable advances, yet the discovery of high-affinity binders typically requires large-scale experimental screening of site-saturated mutant (SSM) libraries. Here, we explore how massively parallel free energy methods can be used for in silico affinity maturation of de novo designed binding proteins. Using an expanded ensemble (EE) approach, we perform exhaustive relative binding free energy calculations for SSM variants of three miniproteins designed by Chevalier et al. (2017) to bind influenza A H1 hemagglutinin. We compare our predictions to experimental ΔΔG values inferred from a Bayesian analysis of the high-throughput sequencing data, and to state-of-the-art predictions made using the Flex ddG Rosetta protocol. A systematic comparison reveals prediction accuracies of around 2 kcal/mol, and identifies net charge changes, large numbers of alchemical atoms, and slow side-chain conformational dynamics as key contributors to the uncertainty of the EE predictions. Flex ddG predictions are more accurate on average, but highly conservative. In contrast, EE predictions can better classify stabilizing and destabilizing mutations. We also explore the ability of SSM scans to rationalize known affinity-matured variants containing multiple mutations, whose effects are non-additive due to epistasis. Simple electrostatic models fail to explain the non-additivity, but the observed mutations are found at positions with higher Shannon entropies. Overall, this work suggests that simulation-based free energy methods can provide predictive information for in silico affinity maturation of designed miniproteins, with many feasible improvements to efficiency and accuracy within reach.