Abstract

A variable selection k-nearest-neighbor (kNN) QSAR modeling approach was applied to a data set of 80 3-arylisoquinolines exhibiting cytotoxicity against the human lung tumor cell line A-549. All compounds were characterized with molecular topology descriptors calculated with the MolconnZ program. Seven compounds were randomly selected from the original data set and used as an external validation set. The remaining subset of 73 compounds was divided into multiple training sets (56 to 61 compounds) and corresponding test sets (17 to 12 compounds) using a chemical diversity sampling method developed in this group. Highly predictive models were obtained, characterized by leave-one-out cross-validated $R^2$ ($q^2$) values greater than 0.8 for the training sets and $R^2$ values greater than 0.7 for the test sets. The robustness of the models was confirmed by the Y-randomization test: all models built using training sets with randomly shuffled activities were characterized by low values, $q^2 \leq 0.26$ for the training sets and $R^2 \leq 0.22$ for the test sets. The twelve best models (those with the highest values of both $q^2$ and $R^2$) predicted the activities of the external validation set of seven compounds with $R^2$ ranging from 0.71 to 0.93.
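The two validation statistics named above, leave-one-out $q^2$ and the Y-randomization test, can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a plain unweighted kNN regressor with no variable selection, the common definition $q^2 = 1 - \mathrm{PRESS}/SS_{total}$, and a synthetic one-descriptor data set standing in for the MolconnZ descriptors and measured activities. All function names, the value of k, and the number of shuffles are assumptions made for illustration.

```python
import math
import random


def knn_predict(train_X, train_y, query, k):
    """Predict an activity as the unweighted mean of the k nearest training points."""
    order = sorted(range(len(train_X)), key=lambda j: math.dist(train_X[j], query))
    nearest = order[:k]
    return sum(train_y[j] for j in nearest) / len(nearest)


def loo_q2(X, y, k=3):
    """Leave-one-out cross-validated R^2: q^2 = 1 - PRESS / SS_total."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(X)):
        # Hold out compound i and predict it from all remaining compounds.
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        press += (y[i] - knn_predict(rest_X, rest_y, X[i], k)) ** 2
    ss_total = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - press / ss_total


def y_randomization(X, y, k=3, n_trials=20, seed=0):
    """Recompute q^2 after shuffling the activities; values should collapse."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_trials):
        shuffled = list(y)
        rng.shuffle(shuffled)
        scores.append(loo_q2(X, shuffled, k))
    return scores


# Synthetic stand-in data (assumption): one descriptor, activity linear in it.
X = [[0.1 * i] for i in range(30)]
y = [2.0 * row[0] for row in X]

real_q2 = loo_q2(X, y)
random_q2 = y_randomization(X, y)
print(f"q2 (real activities): {real_q2:.2f}")
print(f"max q2 (shuffled activities): {max(random_q2):.2f}")
```

A model that captures a real structure-activity relationship should score a high $q^2$ on the true activities and a much lower one on every shuffled copy, which is the pattern the abstract reports for its models.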
