Abstract

Numerous ligand-based drug discovery projects are based on structure-activity relationship (SAR) analysis, such as Free-Wilson (FW) or matched molecular pair (MMP) analysis. Intrinsically they assume linearity and additivity of substituent contributions. These techniques are challenged by nonadditivity (NA) in protein–ligand binding where the change of two functional groups in one molecule results in much higher or lower activity than expected from the respective single changes. Identifying nonlinear cases and possible underlying explanations is crucial for a drug design project since it might influence which lead to follow. By systematically analyzing all AstraZeneca (AZ) inhouse compound data and publicly available ChEMBL25 bioactivity data, we show significant NA events in almost every second assay among the inhouse and once in every third assay in public data sets. Furthermore, 9.4% of all compounds of the AZ database and 5.1% from public sources display significant additivity shifts indicating important SAR features or fundamental measurement errors. Using NA data in combination with machine learning showed that nonadditive data is challenging to predict and even the addition of nonadditive data into training did not result in an increase in predictivity. Overall, NA analysis should be applied on a regular basis in many areas of computational chemistry and can further improve rational drug design.

Highlights

  • The similarity and additivity principles represent the basis of various well-established areas in computeraided drug design (CADD) such as Free-Wilson (FW) [1] analysis, two-dimensional (2D)/three-dimensional (3D) quantitative structure-activity relationship (QSAR) [2], matched molecular pair (MMP) [3] analysis, and computational scoring functions [4, 5]

  • NA calculations are based on MMP analysis (upon the assembly of double-transformation cycles (DTC)), using an open-source code developed by Dalke et al, [56] which is an implementation of the Matched molecular pair analysis (MMPA) algorithm by Hussain and Rea [3]

  • Considering our data and the studies carried out by Kramer et al regarding experimental uncertainty of public and inhouse data sets [34,35,36] 0.3 and 0.5 log units were used as thresholds for AZ and ChEMBL data respectively

Read more

Summary

Introduction

The similarity and additivity principles represent the basis of various well-established areas in computeraided drug design (CADD) such as Free-Wilson (FW) [1] analysis, two-dimensional (2D)/three-dimensional (3D) quantitative structure-activity relationship (QSAR) [2], matched molecular pair (MMP) [3] analysis, and computational scoring functions [4, 5]. Similarity and additivity are often implicitly assumed in CADD approaches in order to identify favorable molecular descriptors and predict the activity of new molecules. Exceptions to linearity and additivity occur when the combination of substituents significantly boosts or decreases the biological activity of a ligand [15,16,17,18,19]. Conformational changes in the binding pocket such as complete reorientation of the ligands alter the free energy of binding [15]. Many nonadditive ‘magic methyl’ cases [13, 14, 22], i.e. attaching a simple alkyl fragment

Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call