Abstract

Numerous studies published in recent years report acceptable quantitative structure-activity relationship (QSAR) models built on heterogeneous data. In many cases, the training sets were constructed from compounds tested in different biological assays, contradicting the opinion that QSAR modeling should be based on data measured by a single protocol. We attempted to develop approaches that help determine how heterogeneous data should be used to create QSAR models from different sets of compounds tested by different experimental methods for the same target and the same endpoint. To this end, more than 100 QSAR models for the IC50 values of ligands interacting with cyclooxygenase 1 and 2 (COX-1, COX-2) and seed lipoxygenase (LOX), obtained from the ChEMBL database, were created using the GUSAR software. The QSAR models were tested on an external set that included 26 new thiazolidinone derivatives, which were experimentally tested for COX-1, COX-2, and LOX inhibition. The IC50 values of the derivatives ranged from 89 μM to 26 μM for LOX, from 200 μM to 0.018 μM for COX-1, and from 210 μM to 1 μM for COX-2. The study showed that model accuracy depends on the distribution of IC50 values of low-activity compounds in the training sets. In most cases, QSAR models built on combined training sets had advantages over QSAR models based on a single publication. We introduced a new method, called "scaling", for combining quantitative data from different experimental studies on the basis of data for reference compounds.
