Abstract

We have developed a drug discovery strategy that employs variable selection quantitative structure-activity relationship (QSAR) models for chemical database mining. The approach starts with the development of rigorously validated QSAR models obtained with the variable selection k nearest neighbor (kNN) method (or, in principle, with any other robust model-building technique). Model validation is based on several statistical criteria, including randomization of the target property (Y-randomization), independent assessment of the training set model's predictive power using external test sets, and establishment of the model's applicability domain. All successful models are employed in database mining concurrently; in each case, only the variables selected during model building (termed the descriptor pharmacophore) are used in chemical similarity searches that compare the active compounds of the training set (queries) with the compounds in chemical databases. For external database entries found to be within a predefined similarity threshold of the training set molecules, the specific biological activity characteristic of the training set compounds is predicted with the validated QSAR models, subject to the applicability domain criteria. Compounds predicted to be highly active by all or most of the models are considered consensus hits. We report the first application of this computational strategy to the discovery of anticonvulsant agents in the Maybridge and National Cancer Institute (NCI) databases, which together contain ca. 250,000 compounds. Forty-eight anticonvulsant agents of the functionalized amino acid (FAA) series were used to build kNN variable selection QSAR models. The 10 best models were applied to mining chemical databases, and 22 compounds were selected as consensus hits.
Nine compounds were synthesized and tested at the NIH Epilepsy Branch, Rockville, MD, using the same biological test that was employed to assess the anticonvulsant activity of the training set compounds; of these nine, four were exact database hits and five were derived from the hits by minor chemical modifications. Seven of these nine compounds were confirmed to be active, indicating an exceptionally high hit rate. The approach described in this report can be used as a general rational drug discovery tool.
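
The consensus-mining workflow summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the Euclidean distance, the fixed per-model distance threshold standing in for the applicability domain, the unweighted mean over neighbors, the strict-majority rule, and all toy descriptor values are assumptions made purely for illustration.

```python
import math

def distance(a, b, selected):
    """Euclidean distance restricted to the descriptor indices a model selected."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in selected))

def knn_predict(query, train_X, train_y, selected, k, threshold):
    """Predict activity as the mean over the k nearest training compounds.

    Returns None when the nearest training compound is farther than
    `threshold` (a simplified, assumed applicability domain check).
    """
    ranked = sorted((distance(query, x, selected), y)
                    for x, y in zip(train_X, train_y))
    nearest = ranked[:k]
    if nearest[0][0] > threshold:
        return None  # outside the applicability domain: no prediction
    return sum(y for _, y in nearest) / len(nearest)

def consensus_hits(database, models, train_X, train_y, activity_cutoff):
    """Return database entries predicted active by a strict majority of models."""
    hits = []
    for name, descriptors in database.items():
        votes = 0
        for m in models:
            pred = knn_predict(descriptors, train_X, train_y,
                               m["selected"], m["k"], m["threshold"])
            if pred is not None and pred >= activity_cutoff:
                votes += 1
        if 2 * votes > len(models):
            hits.append(name)
    return hits

# Toy data (invented): three training compounds with three descriptors each.
train_X = [[0.0, 0.0, 5.0], [1.0, 0.0, -3.0], [10.0, 10.0, 0.0]]
train_y = [8.0, 6.0, 1.0]

# Each hypothetical model keeps its own descriptor subset, k, and threshold.
models = [
    {"selected": [0, 1], "k": 2, "threshold": 2.0},
    {"selected": [0], "k": 1, "threshold": 1.5},
    {"selected": [1], "k": 2, "threshold": 1.0},
]

database = {
    "cpd_A": [0.5, 0.0, 100.0],   # close to the active training compounds
    "cpd_B": [10.0, 10.0, 0.0],   # identical to the inactive training compound
    "cpd_C": [50.0, 50.0, 0.0],   # far from every training compound
}

hits = consensus_hits(database, models, train_X, train_y, activity_cutoff=5.0)
print(hits)
```

In this toy run, `cpd_A` differs sharply from the training set in the third descriptor, yet no model selected that descriptor, so it is still compared and predicted only in the selected subspace. `cpd_C` receives no prediction from any model because it falls outside every assumed applicability domain.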
