Abstract

In this paper, we propose a new definition of the model applicability domain (AD), based on selecting a sufficient portion of individual QSPR models to be accepted for property prediction. The efficiency of this approach is demonstrated in ensemble modeling of the stability constants (logK) of the 1:1 complexes of 17 lanthanide and transition metal ions (M) with various organic ligands (L) in water. The individual linear models, based on substructural molecular fragment (SMF) descriptors, were validated in a 5-fold cross-validation procedure. Each test set compound was subjected to two previously developed ADs: fragment control and bounding box. Predictions for a given compound were then considered reliable if the number of models accepting it was larger than a user-defined portion of the total number of selected individual models; otherwise, the compound was discarded from the modeling. Application of this rule, termed "Quorum Control", resulted in a significant increase in the predictive performance of the consensus models.
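
The quorum rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `quorum_predict`, the parameter names, and the default threshold of 0.5 are hypothetical. The per-model acceptance flags are assumed to be computed beforehand (for example, from fragment control and bounding box checks). Averaging the accepted models' predictions is likewise an assumption, since the abstract does not state how the consensus value is formed.

```python
from typing import Optional, Sequence


def quorum_predict(
    predictions: Sequence[float],
    accepted: Sequence[bool],
    quorum: float = 0.5,
) -> Optional[float]:
    """Return a consensus prediction, or None if the compound is outside the AD.

    predictions: logK predicted by each individual model for one compound.
    accepted:    True where that model's AD checks accept the compound
                 (assumed to be computed elsewhere).
    quorum:      user-defined portion of models that must be exceeded
                 (hypothetical default of 0.5).
    """
    if len(predictions) != len(accepted):
        raise ValueError("predictions and accepted must have equal length")
    if not 0.0 <= quorum <= 1.0:
        raise ValueError("quorum must lie in [0, 1]")
    if not predictions:
        return None

    # Keep only predictions from models whose AD accepts the compound.
    kept = [p for p, ok in zip(predictions, accepted) if ok]

    # The number of accepting models must be strictly larger than the
    # required portion of all models; otherwise the compound is discarded.
    if len(kept) <= quorum * len(predictions):
        return None

    # Assumption: the consensus is the plain mean of accepted predictions.
    return sum(kept) / len(kept)
```

With four models and a quorum of 0.5, a compound accepted by three models receives a consensus value, whereas a compound accepted by only two is discarded because two is not larger than half of four.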

