Abstract

Self-Organizing Molecular Field Analysis (SOMFA) comes with its own built-in regression methodology, Self-Organizing Regression (SOR), rather than relying on an external method such as PLS. In this article, we prove that SOR is equivalent to SIMPLS with one principal component. The modest performance of SOMFA on complex datasets can therefore be attributed primarily to the limitations of its regression methodology. We introduce a multi-component extension of the original SOR methodology (MCSOR) and compare the performance of SOR, MCSOR and SIMPLS on several datasets. The results indicate that, in general, the performance of SOMFA models improves greatly when SOR is replaced with a more sophisticated regression method. The results obtained for the Cramer (CBG) dataset further underline that it is a very poor benchmark dataset and should not be used to evaluate the performance of QSAR techniques.
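
For readers unfamiliar with the one-component model referred to above, the following is a minimal sketch of single-response SIMPLS restricted to one component, written with NumPy. It is not the authors' SOR or MCSOR code; the function names `simpls_one_component` and `predict` are hypothetical, and the sketch assumes a descriptor matrix `X` (rows are molecules) and an activity vector `y`. With a single response, the first SIMPLS weight vector is simply the normalized covariance direction between the centered descriptors and the centered activities.

```python
import numpy as np


def simpls_one_component(X, y):
    """One-component, single-response SIMPLS fit (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Center descriptors and activities.
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean

    # Weight vector: direction of maximal covariance with y.
    w = Xc.T @ yc
    w = w / np.linalg.norm(w)

    # Scores along that direction, and the regression of y on the scores.
    t = Xc @ w
    q = (t @ yc) / (t @ t)

    # Coefficients expressed in the original descriptor space.
    coef = w * q
    intercept = y_mean - x_mean @ coef
    return coef, intercept


def predict(X, coef, intercept):
    """Predict activities from descriptors with the fitted model."""
    return np.asarray(X, dtype=float) @ coef + intercept


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_demo = rng.normal(size=(20, 5))   # made-up descriptors, for illustration only
    y_demo = X_demo @ np.array([1.0, 0.5, 0.0, 0.0, -0.5])
    coef, intercept = simpls_one_component(X_demo, y_demo)
    print(predict(X_demo[:3], coef, intercept))
```

Because the model uses a single latent direction, its coefficient vector is always proportional to the descriptor-activity covariance; adding further components, as a multi-component method does, removes that restriction.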
