Abstract

Despite its growing popularity and success in many modeling applications, Partial Least Squares (PLS) regression continues to pose challenges in the evaluation of important variables. This article describes the relationship between the regression coefficients and orthogonally decomposed variances in PLS. The relation between prediction, model interpretation, and important variable determination is described using the theory of the basic sequence, presented here as a special case of the famous Krylov sequence (or the power method). Variable selection methods, e.g. Selectivity Ratio (SR) and Variable Importance in the Projection (VIP), are also described in this framework. We show that the interpretation can be affected by unnecessary rotation toward the main source of variance in the X-block. Significance Multivariate Correlation (sMC) is developed using the knowledge obtained from the basic sequence to minimize the effect of irrelevant X-structures. At the same time, sMC highlights the variables most correlated to the response. The performance of sMC is demonstrated, using simulated and real datasets, against commonly used variable selection methods such as VIP and SR.
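
The link between PLS and the Krylov sequence mentioned in the abstract can be illustrated numerically. The sketch below is not the article's own derivation of the basic sequence; it only checks the standard textbook relation that, for a single response, the PLS weight vectors span the Krylov subspace generated by X'X and X'y. The function names `pls1_weights` and `krylov_basis`, the random data, and the choice of three components are all assumptions made for illustration.

```python
import numpy as np

def pls1_weights(X, y, n_components):
    """Weight vectors of single-response PLS (NIPALS), deflating X each step."""
    Xk = X.copy()
    W = []
    for _ in range(n_components):
        w = Xk.T @ y                 # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                   # score vector
        p = Xk.T @ t / (t @ t)       # loading vector
        Xk = Xk - np.outer(t, p)     # remove the part of X explained by t
        W.append(w)
    return np.column_stack(W)

def krylov_basis(X, y, n_components):
    """Orthonormal basis of span{s, Ss, S^2 s, ...} with S = X'X, s = X'y."""
    S = X.T @ X
    v = X.T @ y
    cols = []
    for _ in range(n_components):
        v = v / np.linalg.norm(v)    # rescale to keep the columns well conditioned
        cols.append(v)
        v = S @ v                    # next term of the Krylov sequence
    Q, _ = np.linalg.qr(np.column_stack(cols))
    return Q

# Illustrative (made-up) centered data: 30 samples, 6 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 6))
X -= X.mean(axis=0)
y = X @ rng.normal(size=6) + 0.1 * rng.normal(size=30)
y -= y.mean()

W = pls1_weights(X, y, 3)
Q = krylov_basis(X, y, 3)

# Two orthonormal bases span the same subspace iff their projectors agree.
same_subspace = np.allclose(W @ W.T, Q @ Q.T, atol=1e-8)
```

Comparing projection matrices rather than individual vectors avoids sign and ordering ambiguities between the two bases.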
