Explaining the predictions of machine learning models is of critical importance for integrating predictive modeling in drug discovery projects. We have generated a test system for predicting isoform selectivity of phosphoinositide 3-kinase (PI3K) inhibitors and systematically analyzed correct predictions of selective inhibitors using a new methodology termed MolAnchor, which is based on the "anchors" concept from explainable artificial intelligence. The approach is designed to generate chemically intuitive explanations of compound predictions. For nearly all correctly predicted isoform-selective inhibitors, well-defined structural fragments determining the predictions were identified, and in most cases, an individual substructure was responsible for the prediction outcome. For inhibitors with different isoform selectivities, recurrent substructures determining the predictions were distinct. The comparison of newly identified anchor substructures with independent explanations based on calculated feature importance values supported the superior interpretability of MolAnchor explanations. Two highly recurrent substructures determining correct predictions were found to be directly implicated in isoform selectivity of PI3K inhibitors, thus indicating a causal relationship between decisive substructures and selectivity determinants.