Abstract Accurately estimating protein-ligand binding free energy is crucial for drug design and biophysics, yet it remains a challenging task. In this study, we combined the screening molecular mechanics/Poisson-Boltzmann surface area (MM/PBSA) method with several machine learning techniques to compute protein-ligand binding free energies. Our results show that the machine learning models outperform direct screening MM/PBSA calculations in predicting these binding free energies. The random forest (RF) method gave the best predictive performance, with a Pearson correlation coefficient (r_p) of 0.702 and a mean absolute error (MAE) of 1.379 kcal/mol. We also analyzed feature importance rankings in the gradient boosting (GB), adaptive boosting (AdaBoost), and RF methods, and found that feature selection significantly affected predictive performance. In particular, molecular weight (MW) and van der Waals (VDW) energies played a decisive role in the prediction. Overall, this study highlights the potential of combining machine learning methods with screening MM/PBSA to predict binding free energies in biosystems accurately.
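For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how a Pearson correlation coefficient and a mean absolute error can be computed from predicted and measured binding free energies. The function names and the five data points are hypothetical and chosen only for illustration; they are not taken from this study's data set or code.

```python
import math


def pearson_r(predicted, measured):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    # Covariance and variances (unnormalized; the 1/n factors cancel).
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov / math.sqrt(var_p * var_m)


def mean_absolute_error(predicted, measured):
    """Mean absolute error, in the same units as the inputs."""
    return sum(abs(p - m) for p, m in zip(predicted, measured)) / len(predicted)


# Hypothetical binding free energies in kcal/mol (illustrative only).
measured = [-9.1, -7.4, -10.2, -6.8, -8.5]
predicted = [-8.3, -7.9, -9.0, -7.5, -8.8]

r_p = pearson_r(predicted, measured)
mae = mean_absolute_error(predicted, measured)
print(f"r_p = {r_p:.3f}, MAE = {mae:.3f} kcal/mol")
```

A higher r_p indicates that predictions rank and scale with experiment more faithfully, while a lower MAE indicates smaller average deviations in kcal/mol; the two are complementary and are usually reported together.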