Abstract

A variant of orthogonal signal correction (OSC), the ridge estimated OSC (REOSC) algorithm, is proposed to improve the performance of OSC. REOSC preprocesses the original data on the principle that only the irrelevant variation in the OSC components, which disturbs partial least squares (PLS) modeling, should be removed, while the information useful for structure–activity relationship analysis should be retained. The possible removal of useful information in OSC can be mitigated to some extent by using a ridge parameter in the construction of the estimator in REOSC. The generalized cross-validation method was employed to select the ridge parameter and the number of OSC components. The proposed methodology was tested in PLS modeling for QSAR studies of cyclooxygenase-2 inhibitors. The modeling power of PLS is compared across the unprocessed data set, the data set processed by ordinary OSC, and the data set processed by the proposed ridge estimated OSC method. The results demonstrate that data preprocessing using ridge estimated OSC can improve PLS regression models by reducing both model complexity and prediction errors. The analysis of the cyclooxygenase-2 inhibitor data provides some insight into the effect of molecular structure on the activity of anti-inflammatory agents.
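
To illustrate how generalized cross-validation (GCV) can pick a ridge parameter, the following is a minimal sketch for a plain ridge regression estimator, not the REOSC algorithm itself, whose details are not given in this abstract. It assumes the standard GCV criterion n·RSS / (n − tr(H))², where H is the ridge hat matrix. The function names `gcv_score` and `select_ridge_parameter`, the synthetic data, and the candidate grid are all hypothetical choices made for illustration.

```python
import numpy as np

def gcv_score(X, y, lam):
    """GCV score of ridge regression with penalty lam (illustrative helper)."""
    n = X.shape[0]
    # With X = U S V^T, the ridge hat matrix is H = U diag(s^2 / (s^2 + lam)) U^T.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    shrink = s**2 / (s**2 + lam)
    y_hat = U @ (shrink * (U.T @ y))
    residual = y - y_hat
    trace_h = shrink.sum()  # effective degrees of freedom, tr(H)
    return n * float(residual @ residual) / (n - trace_h) ** 2

def select_ridge_parameter(X, y, candidates):
    """Return the candidate penalty with the lowest GCV score, plus all scores."""
    scores = [gcv_score(X, y, lam) for lam in candidates]
    best = int(np.argmin(scores))
    return candidates[best], scores

# Synthetic, mean-centered data (assumption: stands in for descriptors and activities).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 10))
beta = np.zeros(10)
beta[:3] = [1.0, -2.0, 0.5]
y = X @ beta + 0.1 * rng.normal(size=40)
X = X - X.mean(axis=0)
y = y - y.mean()

candidates = np.logspace(-4, 2, 25)  # hypothetical grid of ridge parameters
best_lam, scores = select_ridge_parameter(X, y, candidates)
print("selected ridge parameter:", best_lam)
```

In a setting like the one described above, the same kind of grid search would presumably run jointly over the ridge parameter and the number of OSC components; that joint search is not shown here.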
