The development of robust QSAR models to predict the activity of β-secretase inhibitors is an area of interest because of the rising prevalence of Alzheimer’s disease in the global population. In this paper, we propose using relative distance matrices as input data for QSAR algorithms. These matrices store distances between the structural characteristics of pairs of molecules, and between each molecule and a structural pattern extracted from the whole data set, and thus efficiently represent the correlation between structural changes and activity. Support vector machine, complex tree and Gaussian process algorithms were used to build the classification and regression models, and cross-validation, bootstrapping and y-randomization were applied to validate them. In classification, accuracy and area under the receiver operating characteristic curve are close to 100%; in regression, r2 is close to 1.0 and root mean square error is close to 0.1, in both training and external validation, which supports the ‘goodness’ of the proposal.
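The idea of a relative distance matrix can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `relative_distance_features`, the Euclidean metric, and the use of the mean descriptor vector as the "structural pattern" are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def relative_distance_features(X, reference=None):
    """Build distance-based input features for a QSAR learner.

    X         : (n, d) array of molecular descriptors (hypothetical input).
    reference : (m, d) array the distances are measured against;
                defaults to X itself (e.g. the training set).
    Returns an (n, m + 1) array: distances to each reference molecule,
    followed by one column with the distance to the reference pattern.
    """
    X = np.asarray(X, dtype=float)
    ref = X if reference is None else np.asarray(reference, dtype=float)

    # Pairwise Euclidean distances (assumed metric) via broadcasting.
    diff = X[:, None, :] - ref[None, :, :]
    pairwise = np.sqrt((diff ** 2).sum(axis=-1))

    # Assumed "structural pattern": mean descriptor vector of the reference set.
    pattern = ref.mean(axis=0)
    to_pattern = np.linalg.norm(X - pattern, axis=1)

    return np.hstack([pairwise, to_pattern[:, None]])

# Toy descriptors for three molecules (made-up numbers, not real data).
X_train = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
F_train = relative_distance_features(X_train)
print(F_train.shape)  # (3, 4)
```

Passing the training set as `reference` when featurizing external molecules keeps the feature columns aligned with those seen during training, which is one plausible way to apply such a representation in external validation.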