Abstract

The discovery of new pharmaceutical substances is currently boosted by the availability of the Synthetically Accessible Virtual Inventory (SAVI) library, which includes about 283 million molecules, each annotated with a proposed one-step synthetic route from commercially available starting materials. The SAVI database is well suited to ligand-based virtual screening methods for selecting molecules for experimental testing. In this study, we compare the performance of three approaches to the analysis of structure-activity relationships that differ in their criteria for selecting the “active” and “inactive” compounds included in the training sets. PASS (Prediction of Activity Spectra for Substances), which is based on a modified Naïve Bayes algorithm, was applied because it has been shown to be robust and to provide good predictions of many biological activities from just the structural formula of a compound, even if the information in the training set is incomplete. We used different subsets of kinase inhibitors for this case study because a large amount of data is currently available on this important class of drug-like molecules. We trained PASS on subsets of kinase inhibitors extracted from the ChEMBL 20 database and then applied the model to ChEMBL 23 compounds not yet present in ChEMBL 20 to identify novel kinase inhibitors. As one might expect, the best prediction accuracy was obtained when only experimentally confirmed active and inactive compounds for distinct kinases were used in the training procedure. However, for some kinases, reasonable results were obtained even with merged training sets, in which the compounds not tested against a particular kinase were designated as inactives. Thus, depending on the availability of data for a particular biological activity, one may choose the first or the second approach for creating ligand-based computational tools to achieve the best possible results in virtual screening.
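The difference between training only on experimentally tested compounds and training on a merged set can be illustrated with a minimal Python sketch. It is not the PASS training procedure and does not reproduce the exact set definitions used in the study; the record layout `(compound, kinase, is_active)`, the function names, and the example identifiers are all hypothetical.

```python
from collections import defaultdict


def individual_sets(records):
    """Per-kinase labels built only from compounds actually tested on that kinase.

    `records` is an iterable of (compound, kinase, is_active) tuples
    (a hypothetical layout chosen for this illustration).
    """
    sets = defaultdict(dict)
    for compound, kinase, is_active in records:
        sets[kinase][compound] = bool(is_active)
    return dict(sets)


def merged_sets(records):
    """Per-kinase labels over the pooled compounds of all kinases.

    A compound that was never tested against a kinase is designated
    inactive for that kinase, which is an assumption rather than a
    measured result.
    """
    records = list(records)
    tested = individual_sets(records)
    all_compounds = sorted({compound for compound, _, _ in records})
    return {
        kinase: {c: labels.get(c, False) for c in all_compounds}
        for kinase, labels in tested.items()
    }


# Toy example with made-up compound identifiers.
example = [
    ("cmpd_1", "EGFR", True),
    ("cmpd_2", "EGFR", False),
    ("cmpd_3", "ABL1", True),
]
print(individual_sets(example)["EGFR"])
print(merged_sets(example)["EGFR"])
```

In the merged variant every kinase sees the same pool of compounds, so the training set grows at the cost of label noise: some of the presumed inactives may in fact be untested actives.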

Highlights

  • Discovery of novel pharmaceutical agents with improved safety and efficacy is the perpetual task of medicinal chemistry (Pammolli et al., 2011).

  • Available chemical libraries can be divided into four categories: (1) databases containing information about the structure and properties of publicly disclosed chemical compounds, e.g., PubChem (Li et al., 2010; Wang Y. et al., 2014) and ChEMBL (Bento et al., 2014); (2) databases containing information about the structure of commercially available chemical samples, e.g., ZINC (Sterling and Irwin, 2015); (3) databases of virtually generated structures comprehensively covering a particular chemical space, e.g., GDB-17 (Ruddigkeit et al., 2012); (4) databases of virtually generated, synthetically accessible structures with data on starting materials and proposed synthetic routes, e.g., SAVI (Synthetically Accessible Virtual Inventory) (Pevzner et al., 2017).

  • We used the resampled results to calculate values of the Boltzmann-Enhanced Discrimination of Receiver Operating Characteristic (BEDROC) at different degrees of early recognition of true positives (TP). These values were grouped according to the type of set used for training and averaged over the kinases in the same way as the original results. Statistical analysis of these data shows that classifiers trained on Individual sets (I-sets) significantly outperform classifiers trained on the Merged actives and inactives set (MAI-set), and those, in turn, outperform classifiers trained on the Merged actives set (MA-set) (Figure 5) for every α value used in the study.
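For readers unfamiliar with BEDROC, the following minimal Python sketch computes it for a single ranked list. It assumes the commonly used formulation introduced by Truchon and Bayly (2007), in which the robust initial enhancement (RIE) is rescaled between its worst-case and best-case values; whether the study used exactly this formulation, and which α values it used, is not stated here. The function name `bedroc` and the default `alpha=20.0` are illustrative choices.

```python
import math


def bedroc(scores, labels, alpha=20.0):
    """BEDROC in [0, 1] for one ranked list; higher scores rank earlier.

    scores: predicted scores, one per compound.
    labels: truthy for actives (true positives), falsy for inactives.
    alpha:  early-recognition parameter; larger values weight the top
            of the ranked list more heavily.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    n_total = len(scores)
    if n_total == 0 or len(labels) != n_total:
        raise ValueError("scores and labels must be non-empty and equal in length")

    # Rank compounds by descending score (stable sort keeps input order on ties).
    order = sorted(range(n_total), key=lambda i: -scores[i])
    active_ranks = [rank for rank, i in enumerate(order, start=1) if labels[i]]
    n_act = len(active_ranks)
    if n_act == 0:
        raise ValueError("at least one active is required")
    if n_act == n_total:
        return 1.0  # every compound is active, so any ranking is perfect

    ratio = n_act / n_total
    # Sum of exponential weights over the ranks of the actives.
    sum_exp = sum(math.exp(-alpha * r / n_total) for r in active_ranks)
    # Expected value of that sum if the actives were ranked uniformly at random.
    random_sum = ratio * (1 - math.exp(-alpha)) / (math.exp(alpha / n_total) - 1)
    rie = sum_exp / random_sum

    # Best case (all actives on top) and worst case (all actives at the bottom).
    rie_max = (1 - math.exp(-alpha * ratio)) / (ratio * (1 - math.exp(-alpha)))
    rie_min = (1 - math.exp(alpha * ratio)) / (ratio * (1 - math.exp(alpha)))
    return (rie - rie_min) / (rie_max - rie_min)


# Toy example: ten compounds, two actives ranked first and fourth.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
for a in (5.0, 20.0):
    print(a, bedroc(toy_scores, toy_labels, alpha=a))
```

Evaluating the same ranking at several α values, as in the loop above, shows how strongly the metric rewards actives placed near the top of the list.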

Summary

Introduction

Discovery of novel pharmaceutical agents with improved safety and efficacy is the perpetual task of medicinal chemistry (Pammolli et al., 2011). In addition to the traditional methods of chemical synthesis and pharmacological study of drug-like substances, substantial attention has been paid in recent years to the analysis of the general chemical-biological space (Lipinski and Hopkins, 2004; Baell and Holloway, 2010; Bon and Waldmann, 2010; López-Vallejo et al., 2012; Deng et al., 2013; Medina-Franco et al., 2013; Buonfiglio et al., 2015; Rodriguez-Esteban, 2016; Horvath et al., 2017). Such approaches significantly increase the diversity of the chemical libraries studied, as well as the chances of identifying pharmaceutical agents that interact with multiple molecular targets and produce additive or synergistic desired pharmacological effects (Sidorov et al., 2015; Lauria et al., 2016). SAVI represents a significant, previously unexploited reservoir of novel structures that are presumably useful for drug discovery.
