A new data representation based on relative measurements and fingerprint patterns for the development of QSAR regression models

Irene Luque Ruiz, Miguel Ángel Gómez Nieto

doi:10.1016/j.chemolab.2018.03.007

Abstract

Relative distance matrices represent measurements of the structural characteristics of molecules, taking into account a reference pattern common to the whole data set used to develop QSAR regression models. These matrices store relationships between the molecules of the data set by measuring the transformation cost between pairs of molecules and a pattern built from the fragments common to the entire data set. These measurements are closely related to changes in the activity value, so their use allows robust QSAR regression models to be built. In this paper, we describe the construction of relative distance matrices to represent two data sets with clearly different characteristics that have previously been used as benchmarks. Using support vector machine algorithms, several models were trained and externally validated, with the training and external validation sets selected at random. The results, with correlation coefficients greater than 0.9, low error values, and slope and bias values close to ideal, demonstrate the quality of the proposal and clearly improve on the results reported in the literature.
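
The validation statistics named in the abstract (correlation coefficient, error, slope and bias) can be illustrated with a minimal sketch. It assumes that slope and bias refer to the least-squares line of predicted against observed activity, so the ideal values are 1 and 0, and that the error is a root-mean-square error; the abstract does not state these definitions. The function name `validation_stats` and the activity values are hypothetical and are not taken from the paper.

```python
import math


def validation_stats(observed, predicted):
    """Return r, slope, bias and RMSE for predicted vs. observed values.

    Assumption: slope and bias are the coefficients of the least-squares
    line predicted = slope * observed + bias (ideal: slope 1, bias 0).
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n

    # Sums of squares and cross-products around the means.
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    if s_oo == 0 or s_pp == 0:
        raise ValueError("observed and predicted values must not be constant")

    r = s_op / math.sqrt(s_oo * s_pp)   # Pearson correlation coefficient
    slope = s_op / s_oo                 # least-squares slope
    bias = mean_p - slope * mean_o      # least-squares intercept
    rmse = math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)
    return {"r": r, "slope": slope, "bias": bias, "rmse": rmse}


# Hypothetical activity values for an external validation set.
observed = [5.1, 6.3, 7.0, 4.2, 8.4]
predicted = [5.0, 6.5, 6.8, 4.4, 8.1]
stats = validation_stats(observed, predicted)
print(stats)
```

Under these assumptions, a model that performs well on the external set gives `r` near 1, `slope` near 1, and `bias` and `rmse` near 0.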
