Abstract

In multivariate PLS (partial least squares projections to latent structures) and PC (principal components) models with many variables, plots and lists of loadings, coefficients, VIPs, etc. become messy and results are difficult to interpret. There is then a strong temptation to reduce the variables to a smaller, more manageable number. This reduction of variables, however, often removes information, makes the interpretation misleading and seriously increases the risk of spurious models. A better alternative is often to divide the variables into conceptually meaningful blocks and then apply hierarchical multiblock PLS (or PC) models. This blocking leads to two model levels: the upper level, where the relationships between blocks are modelled, and the lower level, which shows the details of each block. On each level, ‘standard’ PLS or PC score and loading plots are available for model interpretation. This allows an interpretation focused on pertinent blocks and their dominant variables. Such blocking is natural and straightforward in spectroscopy (multivariate calibration), quantitative molecular modelling (e.g. CoMFA) and process modelling. The principles of hierarchical multivariate PLS and PC modelling are reviewed, some problems with variable selection are discussed and the approach is illustrated for a data set with around 300 variables and 500 observations taken from a residue catalytic cracker (RCCU) at the Statoil Mongstad refinery in Norway. © 1996 by John Wiley & Sons, Ltd.
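The two-level idea can be illustrated with a minimal sketch, given here for the PC case only. It is not the hierarchical algorithm reviewed in the paper: it is a simplified two-step variant in which an ordinary PCA is fitted to each block and the block scores are then used as the variables of an upper-level PCA. The function names (`pca_scores`, `hierarchical_pca`), the block names, the number of components and the random data are all assumptions made for illustration.

```python
import numpy as np

def pca_scores(X, n_components):
    """PCA via SVD on column-centred, unit-variance-scaled data.

    Returns scores T (observations x components) and loadings P
    (variables x components).
    """
    Xc = X - X.mean(axis=0)
    std = Xc.std(axis=0, ddof=1)
    std[std == 0] = 1.0          # leave constant columns unscaled
    Xs = Xc / std
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]
    P = Vt[:n_components].T
    return T, P

def hierarchical_pca(X, blocks, n_lower=2, n_upper=2):
    """Simplified two-level PCA (an illustrative assumption, not the paper's algorithm).

    blocks maps a block name to the list of column indices in X.
    Lower level: one PCA per block, giving block scores and loadings.
    Upper level: PCA on the concatenated block scores.
    """
    lower = {}
    block_scores = []
    for name, cols in blocks.items():
        T, P = pca_scores(X[:, cols], n_lower)
        lower[name] = (T, P)
        block_scores.append(T)
    # Each block contributes n_lower "super variables" to the upper level;
    # pca_scores rescales them to unit variance, so blocks weigh equally.
    super_X = np.hstack(block_scores)
    T_super, P_super = pca_scores(super_X, n_upper)
    return lower, T_super, P_super

# Hypothetical example: 50 observations, 12 variables in three process blocks.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 12))
blocks = {
    "feed": [0, 1, 2, 3],
    "reactor": [4, 5, 6, 7],
    "regenerator": [8, 9, 10, 11],
}
lower, T_super, P_super = hierarchical_pca(X, blocks)
```

In this sketch the upper-level loadings `P_super` indicate which block scores dominate each upper-level component, and the lower-level loadings stored in `lower[name]` show which variables drive a given block, which mirrors the interpretation by blocks and then by variables described above.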
