Automatic QSAR modeling of ADME properties: blood–brain barrier penetration and aqueous solubility

Olga Obrezanova, Matthew D Segall, Edmund J Champness, Joelle M R Gola

doi:10.1007/s10822-008-9193-8

Abstract

In this article, we present an automatic model generation process for building QSAR models using Gaussian Processes, a powerful machine learning modeling method. We describe the stages of the process that ensure models are built and validated within a rigorous framework: descriptor calculation, splitting data into training, validation and test sets, descriptor filtering, application of modeling techniques and selection of the best model. We apply this automatic process to data sets of blood–brain barrier penetration and aqueous solubility and compare the resulting automatically generated models with 'manually' built models using external test sets. The results demonstrate the effectiveness of the automatic model generation process for two types of data sets commonly encountered in building ADME QSAR models: a small set of in vivo data and a large set of physico-chemical data.
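
The stages listed in the abstract can be illustrated with a minimal sketch. This is not the authors' software or their exact procedure: it substitutes scikit-learn's Gaussian Process regressor, uses synthetic "descriptors" in place of calculated molecular descriptors, and assumes a 70/15/15 split, a zero-variance descriptor filter, and two candidate kernels. The function name `build_model` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def build_model(X, y, seed=0):
    """Hypothetical helper: split, filter, fit candidate GPs, select the best."""
    # Split into training / validation / test sets (70/15/15 is an assumption).
    X_train, X_rest, y_train, y_rest = train_test_split(
        X, y, test_size=0.3, random_state=seed)
    X_val, X_test, y_val, y_test = train_test_split(
        X_rest, y_rest, test_size=0.5, random_state=seed)

    # Descriptor filtering: drop descriptors that are constant on the training set.
    selector = VarianceThreshold().fit(X_train)
    scaler = StandardScaler().fit(selector.transform(X_train))
    n_kept = int(selector.get_support().sum())

    def prep(A):
        # Apply the filter and scaling learned from the training set only.
        return scaler.transform(selector.transform(A))

    # Candidate modeling techniques: two GP kernels (an illustrative choice).
    candidates = {
        "rbf_isotropic": RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1),
        "rbf_ard": RBF(length_scale=np.ones(n_kept)) + WhiteKernel(noise_level=0.1),
    }

    fitted, val_rmse = {}, {}
    for name, kernel in candidates.items():
        gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True,
                                      random_state=seed)
        gp.fit(prep(X_train), y_train)
        pred = gp.predict(prep(X_val))
        val_rmse[name] = float(np.sqrt(mean_squared_error(y_val, pred)))
        fitted[name] = gp

    # Select the best model on the validation set, then score it once on the test set.
    best_name = min(val_rmse, key=val_rmse.get)
    test_pred = fitted[best_name].predict(prep(X_test))
    test_rmse = float(np.sqrt(mean_squared_error(y_test, test_pred)))
    return best_name, val_rmse, test_rmse, n_kept


# Synthetic stand-in for a descriptor matrix; column 5 is constant on purpose.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 6))
X[:, 5] = 0.0
y = 1.5 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(scale=0.1, size=120)

best_name, val_rmse, test_rmse, n_kept = build_model(X, y)
print(best_name, val_rmse, test_rmse, n_kept)
```

Holding the test set back until after model selection keeps the final error estimate independent of the choice between candidates, which is the role the abstract assigns to the separate validation and test sets.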
