Abstract

Activity cliffs (ACs) are formed by two structurally similar compounds with a large difference in potency. Accurate AC prediction is expected to support researchers’ decisions in the early stages of drug discovery. Predictive models based on matched molecular pair (MMP) cliffs have been proposed previously; however, these methods are difficult to interpret because of the black-box character of the models. In this study, we developed interpretable MMP fingerprints and modified a model-specific interpretation approach for models based on a support vector machine (SVM) and the MMP kernel. We compared the important features highlighted by this SVM-based interpretation approach with those highlighted by SHapley Additive exPlanations (SHAP), a major model-independent approach. The model-specific approach captured the difference between ACs and non-ACs, whereas SHAP assigned high weights to features not present in the test instances. For specific MMPs, the feature weights mapped by the SVM-based interpretation method agreed with binding knowledge previously confirmed from X-ray co-crystal structures, indicating that this method can interpret the AC prediction model in a chemically intuitive manner.

Highlights

  • Activity cliffs (ACs) [1] are formed by two structurally similar compounds with a large difference in potency

  • To represent a matched molecular pair (MMP) kernel value as a sum of feature contributions, we propose splitting each cross-term in half and assigning one half to the core feature and the other half to the substituent feature, so that the linear contributions of all features sum to the support vector machine (SVM) output

  • We developed interpretable MMP fingerprints and modified a model-specific approach to interpret a well-constructed AC prediction model using the SVM and MMP kernel
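The cross-term splitting described in the highlights can be illustrated with a minimal sketch. It assumes, and this is not stated in the text here, that the MMP kernel is the product of a Tanimoto kernel on the core features and a Tanimoto kernel on the substituent features, and that each fingerprint is a set of integer feature IDs. The function names and the toy fingerprints are hypothetical and are not the authors' implementation.

```python
from collections import defaultdict


def tanimoto_terms(x, y):
    """Split a Tanimoto kernel value into one equal term per shared feature."""
    common = x & y
    denom = len(x) + len(y) - len(common)
    if denom == 0:
        return {}
    return {feature: 1.0 / denom for feature in common}


def mmp_kernel_contributions(mmp_a, mmp_b):
    """Return per-feature contributions and the kernel value for two MMPs.

    Assumption: K = K_core * K_sub. Each cross-term (core term * sub term)
    is split in half between the core feature and the substituent feature.
    """
    (core_a, sub_a), (core_b, sub_b) = mmp_a, mmp_b
    core_terms = tanimoto_terms(core_a, core_b)
    sub_terms = tanimoto_terms(sub_a, sub_b)
    k_core = sum(core_terms.values())
    k_sub = sum(sub_terms.values())

    contrib = {}
    for feature, term in core_terms.items():
        # Half of every cross-term involving this core feature.
        contrib[("core", feature)] = 0.5 * term * k_sub
    for feature, term in sub_terms.items():
        # Half of every cross-term involving this substituent feature.
        contrib[("sub", feature)] = 0.5 * term * k_core
    return contrib, k_core * k_sub


def svm_contributions(support_mmps, dual_coefs, test_mmp):
    """Accumulate coefficient-weighted contributions over support vectors."""
    totals = defaultdict(float)
    for support, coef in zip(support_mmps, dual_coefs):
        contrib, _ = mmp_kernel_contributions(support, test_mmp)
        for key, value in contrib.items():
            totals[key] += coef * value
    return dict(totals)


# Toy MMPs: (core feature IDs, substituent feature IDs), purely illustrative.
mmp_a = ({1, 2, 3}, {10, 11})
mmp_b = ({2, 3, 4}, {11, 12})
contrib, k = mmp_kernel_contributions(mmp_a, mmp_b)
```

Under these assumptions the contributions sum to the kernel value, so the totals from `svm_contributions` plus the SVM bias term would reproduce the decision value for the test MMP.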

Summary

Introduction

Activity cliffs (ACs) [1] are formed by two structurally similar compounds with a large difference in potency. ACs can be found in the hit-to-lead or lead optimization phases, in which structurally analogous compounds are examined to obtain compounds with the desired potency or properties such as absorption, distribution, metabolism, excretion, and toxicity. A useful tool for representing the similarity between compounds that differ by small chemical modifications is the matched molecular pair (MMP) [2]. An MMP consists of two structurally similar compounds that share a common substructure (the core) and differ at a single site (the substituents). MMPs help link a potency change to a single chemical modification and make it possible to systematically identify ACs as MMP-cliffs [3]. An MMP-cliff is defined as an MMP with a significant difference in potency (generally >2 log units).
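As a small illustration of the MMP-cliff definition, the sketch below labels a pair from the potencies of its two compounds. It assumes potencies are given on a negative logarithmic scale (for example pKi or pIC50) and uses the 2 log unit threshold mentioned above as a default; the function name and the example values are hypothetical.

```python
def is_mmp_cliff(potency_a, potency_b, threshold=2.0):
    """Return True if the potency difference of an MMP exceeds the threshold.

    Assumption: both potencies are on a negative log scale (e.g. pKi),
    so the difference is expressed in log units.
    """
    return abs(potency_a - potency_b) > threshold


# Illustrative values only: a 2.5 log unit gap and a 0.8 log unit gap.
cliff = is_mmp_cliff(8.5, 6.0)
non_cliff = is_mmp_cliff(7.0, 6.2)
```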

