Abstract

The discovery and optimization of small-molecule inhibitors as therapeutic drugs have benefited immensely from rational structure-based drug design. With recent advances in high-resolution structure determination, computational power, and machine learning methodology, it is becoming more tractable to elucidate the structural basis of drug potency. However, the applicability of machine learning models to drug design is limited by the difficulty of interpreting the resulting models in terms of feature importance. Here, we take advantage of the large number of available inhibitor-bound HIV-1 protease structures and associated potencies to evaluate inhibitor diversity and to assess machine learning models for predicting ligand affinity. First, using a hierarchical clustering approach, we grouped HIV-1 protease inhibitors and identified distinct core structures. Explicit features, including protein-ligand interactions, were then extracted from high-resolution cocrystal structures as 3D-based fingerprints. We found that a gradient boosting machine learning model using these explicit features can predict binding affinity with high accuracy. Finally, Shapley values were derived to explain local feature importance. We found that specific van der Waals (vdW) interactions involving key protein residues are pivotal for the predicted potency. Protein-specific and interpretable prediction models can guide the optimization of many small-molecule drugs for improved potency.

