Abstract

A Quantitative Structure-Activity Relationship (QSAR) classification approach was used to predict compounds as active or inactive with respect to overall biological activity, antitumor activity, and antibiotic activity, using a data set of 1746 compounds from PubChem described by empirical CDK descriptors and semi-empirical quantum-chemical descriptors. A data set of 183 active pharmaceutical ingredients was additionally used for external validation of the best models. The best classification models for antibiotic and antitumor activities were used to screen a data set of marine and microbial natural products from the AntiMarin database; 25 and four lead compounds were proposed for antibiotic and antitumor drug design, respectively. The present work puts forward a new set of possible lead-like bioactive compounds and corroborates the results of our previous investigations. It also shows the usefulness of quantum-chemical descriptors in discriminating between biologically active and inactive compounds. None of the compounds suggested by our approach has been assigned non-antibiotic and non-antitumor activities in the AntiMarin database, and almost all were recently reported as active in the literature.

Highlights

  • Natural products (NPs), or synthetic products inspired by NPs, have been the single most productive source of leads for the development of drugs

  • The present study focuses on the application of machine learning (ML) techniques to find lead-like molecules en route to antitumor and antibiotic drugs among 418 marine natural products (MNPs) and microbial natural products (MbNPs)

  • In the AntiMarin set, there are 87 compounds with these specifications. Of those, only 68 compounds are active, compared with approximately 96% of active compounds among the test set I compounds (PubChem) with the same specifications. Of the AntiMarin compounds with these specifications, 59 and 14 were predicted as true positives (TPs) and false positives (FPs), with an average predicted probability of being active (Avg.Probactive) of 0.73 and 0.70, respectively, using the best Random Forests (RFs) model for the AntiMarin set.
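The counts in the last highlight can be turned into the usual classification metrics. The sketch below is only an illustration: it assumes that all 87 compounds received a prediction and that the 68 reported actives are the ground truth, so the false negatives and true negatives are inferred rather than taken from the source. The function name `confusion_metrics` is hypothetical.

```python
def confusion_metrics(tp, fp, n_total, n_active):
    """Derive the remaining confusion-matrix cells and standard metrics.

    Assumption (not stated in the source): every one of the n_total
    compounds was classified, so FN and TN follow by subtraction.
    """
    n_inactive = n_total - n_active
    fn = n_active - tp      # actives the model missed
    tn = n_inactive - fp    # inactives correctly rejected
    if fn < 0 or tn < 0:
        raise ValueError("counts are inconsistent with each other")
    return {
        "fn": fn,
        "tn": tn,
        "sensitivity": tp / n_active,     # recall on actives
        "precision": tp / (tp + fp),      # share of predicted actives that are active
        "specificity": tn / n_inactive,   # recall on inactives
        "accuracy": (tp + tn) / n_total,
    }


# Counts quoted in the highlight: 87 compounds, 68 active, 59 TPs, 14 FPs.
metrics = confusion_metrics(tp=59, fp=14, n_total=87, n_active=68)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}" if isinstance(value, float) else f"{name}: {value}")
```

Under these assumptions the inferred cells are 9 false negatives and 5 true negatives. The full text of the paper remains the authority for the actual confusion matrix.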


Summary

Introduction

Natural products (NPs), or synthetic products inspired by NPs, have been the single most productive source of leads for the development of drugs. In recent years, few computational approaches have been applied to the in silico screening of NPs [8,9,10,11,12,13,14,15,16]. This field can be significantly improved by modeling data from large databases of biological activity information, which are becoming available to the scientific community. In QSAR studies, quantum-chemical descriptors such as net atomic charges, HOMO and LUMO energies, hardness, chemical potential, and electrophilicity index have been shown to be useful in estimating various biological activities, for example in studies of toxicity, mutagenicity, and carcinogenicity, as well as in studies of their mechanisms of action [20,21,22].
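Several of the descriptors named above can be derived from the frontier orbital energies alone. The sketch below uses one common conceptual-DFT convention (Koopmans-type approximation): hardness η = (E_LUMO − E_HOMO)/2, chemical potential μ = (E_HOMO + E_LUMO)/2, and electrophilicity index ω = μ²/(2η). This convention is an assumption; the excerpt does not state which definitions the authors used. The function name and the orbital energies in the example are made up for illustration.

```python
def reactivity_descriptors(e_homo, e_lumo):
    """Global reactivity descriptors from HOMO/LUMO energies (same units, e.g. eV).

    Assumed convention (not confirmed by the source):
      hardness            eta   = (E_LUMO - E_HOMO) / 2
      chemical potential  mu    = (E_HOMO + E_LUMO) / 2
      electrophilicity    omega = mu**2 / (2 * eta)
    """
    gap = e_lumo - e_homo
    if gap <= 0:
        raise ValueError("expected E_LUMO > E_HOMO")
    hardness = gap / 2
    chemical_potential = (e_homo + e_lumo) / 2
    electrophilicity = chemical_potential ** 2 / (2 * hardness)
    return {
        "hardness": hardness,
        "chemical_potential": chemical_potential,
        "electrophilicity": electrophilicity,
    }


# Illustrative orbital energies in eV (hypothetical, not from the paper).
example = reactivity_descriptors(e_homo=-9.0, e_lumo=-1.0)
print(example)
```

Some authors define hardness as the full HOMO-LUMO gap without the factor of one half, which rescales the hardness and electrophilicity values but not their ordering across compounds.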

Establishment of QSAR Classification Models
Overall Biological Activity Model
Antitumor Activity Model
Antibiotic Activity Model
Data Sets and Descriptors
Molecular Descriptors
Selection of Descriptors and Optimization of QSAR Classification Methods
ML Techniques
Conclusions