Abstract

Quantitative structure–activity relationship (QSAR) studies were carried out on indolyl aryl sulfones, a novel class of HIV-1 non-nucleoside reverse transcriptase inhibitors, using physicochemical, topological and structural parameters along with appropriate indicator variables. The statistical tools used were linear methods (stepwise regression analysis, partial least squares (PLS), factor analysis followed by multiple linear regression (FA-MLR), genetic function approximation combined with multiple linear regression (GFA-MLR), and GFA followed by PLS (G/PLS)) and a nonlinear method (artificial neural network, ANN). With physicochemical parameters, GFA-MLR generated the best equation (n = 97, R² = 0.862, Q² = 0.821). With topological parameters, the best equation (based on leave-one-out Q²) was obtained with stepwise regression (n = 97, R² = 0.867, Q² = 0.811). When topological and physicochemical parameters were used in combination, statistical quality improved (n = 97, R² = 0.891, Q² = 0.849 from stepwise regression). Furthermore, the whole dataset was divided into a test set (25% of the dataset) and a training set (the remaining 75%). Models were developed on the training set, and their predictive potential was checked on the test set. The training set was selected by K-means clustering of the standardized descriptors (topological and physicochemical). Here too, the best results were obtained with stepwise regression (n = 72, R² = 0.906, Q² = 0.853), but the external predictive capacity of this model () was inferior to that of the model developed with the GFA-MLR technique (R² = 0.883, Q² = 0.823, ). However, the squared correlation coefficient between observed and predicted activity values of the test set compounds for the best linear model, i.e., GFA-MLR (r² = 0.736), was lower than that for the best nonlinear model, developed using an artificial neural network (r² = 0.781).
Thus, based on external validation, the ANN models were superior to the linear models. The predictive potential of the best linear equation (stepwise regression model) was superior to that of the previously published CoMFA model (Q² = 0.81, SDEPTest = 0.89) on the same data set (Ragno R. et al., J Med Chem 2006, 49, 3172–3184). Furthermore, the physicochemical parameter-based models supported the previous observations based on docking (Ragno R. et al., J Med Chem 2005, 48, 213–223).
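
The validation statistics quoted above (R², leave-one-out Q², and an external predictive R² on the test set) can be illustrated with a minimal sketch. This is not the authors' code: it assumes a single-descriptor least-squares model instead of the multi-descriptor models in the study, the function names are hypothetical, and the external statistic uses the commonly cited form that scales test-set errors by the deviation from the training-set mean, which the abstract does not spell out.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(xs, ys):
    """Fitted R^2: 1 - SS_residual / SS_total on the same data used to fit."""
    a, b = fit_line(xs, ys)
    y_bar = mean(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def q_squared_loo(xs, ys):
    """Leave-one-out Q^2: 1 - PRESS / SS_total.

    Each compound is predicted by a model refitted without it.
    """
    press = 0.0
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    y_bar = mean(ys)
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - press / ss_tot


def r_squared_pred(x_train, y_train, x_test, y_test):
    """External predictive R^2 (assumed form): test-set errors scaled by
    the test-set deviation from the TRAINING-set mean activity."""
    a, b = fit_line(x_train, y_train)
    y_bar_train = mean(y_train)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(x_test, y_test))
    ss_tot = sum((y - y_bar_train) ** 2 for y in y_test)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up toy values, not data from the study.
    descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    activity = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]
    print("R2 :", r_squared(descriptor, activity))
    print("Q2 :", q_squared_loo(descriptor, activity))
```

For a least-squares model, the leave-one-out Q² can never exceed the fitted R², which is consistent with every R²/Q² pair reported in the abstract.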
