Abstract

Building quantitative structure–activity relationship (QSAR) models for the in silico prediction of the steady-state volume of distribution of drugs is vital for selecting drug candidates at the synthesis stage. Regression models based on molecular descriptor matrices have been proposed, but their accuracy is low, mainly because of the difficulty of compiling an appropriate dataset and the lack of information on dataset representation. In this paper, we use a benchmark dataset of very diverse drugs to develop predictive models for volume of distribution that take relative distance matrices as the input data to QSAR algorithms. Support vector machine, complex tree, bagged tree and Gaussian process regression algorithms were tested with fingerprint, similarity and relative distance matrices as input data, and the results of the resulting models were compared. Relative distance matrices generated robust regression models in the training and external validation stages, both performed using cross-validation, with values for correlation coefficient, bias, slope and root-mean-square error close to the ideal. Relative distance matrices were also used to design classification models, which gave excellent results, with accuracy and area under the receiver operating characteristic curve (AUC) close to 100%.
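As a rough illustration of the quantities named above, the following minimal Python sketch builds a distance matrix from binary fingerprints and computes the four regression statistics. It rests on assumptions not stated in the abstract: fingerprints are represented as sets of on-bit indices, the distance is the Tanimoto distance, a "relative distance matrix" is taken to mean each compound's distances to a set of reference compounds, bias is the mean of predicted minus observed values, and slope is that of the least-squares line of predicted on observed values. The function names are hypothetical and this is not the authors' implementation.

```python
import math


def tanimoto_distance(a, b):
    """Tanimoto distance between two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        return 0.0  # assumption: two empty fingerprints count as identical
    return 1.0 - len(a & b) / union


def distance_matrix(fingerprints, references):
    """Row i holds the distances from compound i to every reference compound."""
    return [[tanimoto_distance(fp, ref) for ref in references]
            for fp in fingerprints]


def regression_metrics(observed, predicted):
    """Correlation coefficient, bias, slope and RMSE of predictions.

    Ideal values are r = 1, bias = 0, slope = 1 and rmse = 0.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_op = sum((o - mean_o) * (p - mean_p)
               for o, p in zip(observed, predicted))
    squared_errors = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    return {
        "r": s_op / math.sqrt(s_oo * s_pp),
        "bias": mean_p - mean_o,
        "slope": s_op / s_oo,
        "rmse": math.sqrt(squared_errors / n),
    }
```

In a workflow of the kind described, the rows of such a matrix would serve as feature vectors for the regression or classification algorithm, and the metrics would be computed on cross-validated predictions.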
