Abstract

In recent years, concern has grown about how to evaluate the quality of predictions made by quantitative structure-activity relationship (QSAR) models. A well-defined applicability domain (AD) is crucial to the validation of QSAR models, as stated in the third principle of the Organisation for Economic Co-operation and Development (OECD). In this study, a new perspective on defining the AD of a model is proposed, based on a population analysis (PA) strategy that comprises model population analysis (MPA) and approach population analysis (APA). MPA applies classical AD approaches to a large number of sub-datasets derived from the training set. With MPA, the classical AD approaches can distinguish some samples that they cannot distinguish when the full training set is used. APA then takes the union of the results generated by all the AD approaches used, giving a consensus list of samples that fall outside the AD. To investigate the performance of the PA strategy in defining the AD with classical AD approaches, two QSAR datasets were used. The results show that the PA strategy helps three classical AD approaches distinguish additional samples that cannot be distinguished using the full training dataset. When these additional samples were excluded, the root mean square error of prediction for the test set decreased, suggesting that the PA strategy has the potential to identify samples that cannot be reliably predicted.
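The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract leaves most details open. The following are assumptions made only for illustration: the two stand-in AD approaches (a descriptor-range check and a distance-to-centroid check), random sub-sampling without replacement, the sub-dataset fraction `frac`, the number of sub-datasets `n_sub`, and the flagging frequency cutoff `min_freq`. All function names are hypothetical.

```python
import math
import random


def range_outside(train, test):
    """Stand-in AD approach: outside if any descriptor leaves the training range."""
    lo = [min(col) for col in zip(*train)]
    hi = [max(col) for col in zip(*train)]
    return [any(v < l or v > h for v, l, h in zip(x, lo, hi)) for x in test]


def centroid_outside(train, test):
    """Stand-in AD approach: outside if farther from the centroid than any training sample."""
    c = [sum(col) / len(train) for col in zip(*train)]
    cutoff = max(math.dist(x, c) for x in train)
    return [math.dist(x, c) > cutoff for x in test]


def mpa_outside(ad_fn, train, test, n_sub=200, frac=0.8, min_freq=0.5, seed=0):
    """MPA sketch: apply one AD approach to many random sub-datasets of the training set.

    Assumption: a test sample is reported as outside the AD when at least
    `min_freq` of the sub-datasets flag it. The source does not state this rule.
    """
    rng = random.Random(seed)
    k = max(2, int(frac * len(train)))
    counts = [0] * len(test)
    for _ in range(n_sub):
        sub = rng.sample(train, k)  # one sub-dataset, drawn without replacement
        for i, flag in enumerate(ad_fn(sub, test)):
            counts[i] += flag
    return [c / n_sub >= min_freq for c in counts]


def apa_outside(ad_fns, train, test, **kwargs):
    """APA sketch: union of the outside-AD flags produced by every approach."""
    per_approach = [mpa_outside(f, train, test, **kwargs) for f in ad_fns]
    return [any(flags) for flags in zip(*per_approach)]


if __name__ == "__main__":
    # Toy descriptors: a 5 x 5 grid as the training set, two query samples.
    train = [[float(i), float(j)] for i in range(5) for j in range(5)]
    test = [[2.0, 2.0], [10.0, 10.0]]
    print(apa_outside([range_outside, centroid_outside], train, test, n_sub=50))
```

Because each sub-dataset omits some training samples, its domain boundary is never wider than that of the full training set, which is one plausible reading of why MPA can flag samples that the full training set does not. In practice the comparison reported in the abstract would then be made by recomputing the test-set prediction error with the flagged samples excluded.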
