The accurate calculation of equilibrium constants for protein-protein association is of fundamental importance to quantitative biology and remains an outstanding challenge for computational biophysics. Traditionally, equilibrium constants have been computed from one-dimensional free energy surfaces derived from sampling along a single collective variable. Recent advances in enhanced sampling methodology have facilitated the characterization of multidimensional free energy landscapes, often exposing multiple thermodynamically important minima missed by more restrictive sampling methods. Key to the effectiveness of this multidimensional sampling approach is the identification of collective variables that define the configurational space of the dissociated and associated states. Here we present the application of two machine learning methods for the unbiased determination of collective variables for enhanced sampling and analysis of protein-protein association. Our results both validate prior work based on intuition-derived collective variables and demonstrate the effectiveness of the machine learning methods for identifying collective variables for association reactions in complex biomolecular systems.