Abstract

Tautomerism is an important aspect of many pharmacologically and biologically active compounds, and accounting for it in computer-aided drug design (CADD) remains a challenge. The estimation and calculation of many physico-chemical properties and theoretical descriptors of molecules are sensitive to tautomerism. In this study, we analyze the effect of tautomerism on feature selection and on the statistical performance of conventional quantitative structure–activity relationship (QSAR) equations. These equations were developed using 2D and 3D descriptors with two different statistical methods, namely genetic algorithm (GA) and stepwise regression (SR). Five datasets of moderate size showing different types of tautomerism were used in the study: (1) anti-malarial activity of synthetic prodiginines against a multi-drug resistant strain (N = 43), (2) anti-malarial activity of bisaryl quinolones (N = 37), (3) anti-malarial activity of phosphoramidate and phosphorothioamidate analogs of amiprophos methyl (N = 36), (4) anti-proliferative activity of substituted N-phenyl ureidobenzenesulfonate derivatives (N = 44), and (5) anti-HIV activity of indolylarylsulfones as HIV-1 non-nucleoside reverse transcriptase inhibitors (N = 36). In each case, the model and the selected descriptors derived using one tautomer were applied to the other tautomeric forms to understand the influence of tautomerism on the QSAR equations. The parameters R, R², R²adj, R²cv, F, and S, together with Y-randomization, were used for thorough validation of all the models. The results revealed that tautomerism has a significant influence on feature selection. In addition, tautomerism was found to have a great influence on the performance of the QSAR models for datasets 2 and 3, whereas no significant influence was observed on the statistical characteristics of the QSAR models for datasets 1, 4, and 5. Therefore, it is suggested that separate models be developed for the different tautomeric forms of a dataset.
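
The validation statistics named above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain multiple linear regression, takes R²cv to be the leave-one-out cross-validated value, and uses synthetic descriptor matrices (`X_a`, `X_b`, both hypothetical) in place of real 2D/3D descriptors computed for two tautomeric forms. All function names are illustrative.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bp]."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return coef[0] + X @ coef[1:]

def qsar_stats(X, y):
    """R, R2, adjusted R2, leave-one-out R2cv, F and S for an MLR model."""
    n, p = X.shape
    resid = y - predict(fit_mlr(X, y), X)
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    f_stat = (r2 / p) / ((1.0 - r2) / (n - p - 1))
    s = float(np.sqrt(ss_res / (n - p - 1)))  # standard error of estimate
    # Leave-one-out: refit without compound i, then predict compound i.
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        coef_i = fit_mlr(X[keep], y[keep])
        press += float((y[i] - predict(coef_i, X[i:i + 1])[0]) ** 2)
    r2_cv = 1.0 - press / ss_tot
    return {"R": float(np.sqrt(r2)), "R2": r2, "R2_adj": r2_adj,
            "R2_cv": r2_cv, "F": f_stat, "S": s}

def y_randomization(X, y, n_trials=50, seed=0):
    """R2 values obtained after shuffling the activities (chance baseline)."""
    rng = np.random.default_rng(seed)
    return [qsar_stats(X, rng.permutation(y))["R2"] for _ in range(n_trials)]

# Synthetic stand-in data (assumption): 40 compounds, 3 selected descriptors.
rng = np.random.default_rng(0)
X_a = rng.normal(size=(40, 3))                     # descriptors, tautomer A
y = X_a @ np.array([1.0, -0.5, 0.8]) + 0.1 * rng.normal(size=40)
X_b = X_a + 0.5 * rng.normal(size=(40, 3))         # same descriptors, tautomer B

stats_a = qsar_stats(X_a, y)   # model built on tautomer A
stats_b = qsar_stats(X_b, y)   # same descriptor set refit on tautomer B
rand_r2 = y_randomization(X_a, y)

for name, stats in (("tautomer A", stats_a), ("tautomer B", stats_b)):
    print(name, {k: round(v, 3) for k, v in stats.items()})
print("max R2 after Y-randomization:", round(max(rand_r2), 3))
```

Comparing `stats_a` with `stats_b` mirrors the idea of applying descriptors selected for one tautomer to another form, and a model is usually considered non-random when its R² lies well above every value returned by `y_randomization`.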
