Abstract

To improve calibration speed for very large data sets, novel algorithms for principal component regression (PCR) and partial least squares (PLS) regression are presented. They use the Lanczos or PLS-1 transformation to reduce the data matrix X to a small bidiagonal matrix (R), after which the small tridiagonal matrix (R′R) is diagonalized and inverted. The complexity of the PCR model can be optimized by cross-validation (PCRL) or by simpler and faster recipes based on round-off monitoring and model fit (PCRF). A similar fast PLS procedure (PLSF) is also presented. Calculations are made for five near-infrared (NIR) spectroscopy data sets and compared with PCR with feature selection (PCRS) based on correlation and with de Jong's simple partial least squares (SIMPLS). The Lanczos-based methods have prediction performance and model complexity comparable to those of PCRS and SIMPLS but are considerably faster. A detailed comparison of the methods also gives some insight into the performance of the PLS method.
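
The reduction described above can be illustrated with a minimal sketch. It is not the authors' PCRL, PCRF, or PLSF procedure: it assumes a Golub-Kahan (Lanczos) bidiagonalization started from X′y, full reorthogonalization for numerical safety, already centered data, and a fixed number of components k. The function names `lanczos_bidiag` and `lanczos_regression` are hypothetical, and the selection of k (by cross-validation, round-off monitoring, or model fit) is omitted.

```python
import numpy as np

def lanczos_bidiag(X, y, k):
    """Golub-Kahan (Lanczos) bidiagonalization started from X'y.

    Returns U (n x k) and V (p x k) with orthonormal columns and an
    upper bidiagonal R (k x k) such that X @ V = U @ R.
    Assumption: no breakdown occurs (all norms stay nonzero).
    """
    n, p = X.shape
    U = np.zeros((n, k))
    V = np.zeros((p, k))
    R = np.zeros((k, k))
    v = X.T @ y
    v = v / np.linalg.norm(v)
    u = X @ v
    for i in range(k):
        alpha = np.linalg.norm(u)
        u = u / alpha
        U[:, i], V[:, i], R[i, i] = u, v, alpha
        if i == k - 1:
            break
        # Next right vector: beta * v_next = X'u - alpha * v
        v = X.T @ u - alpha * v
        v -= V[:, :i + 1] @ (V[:, :i + 1].T @ v)  # reorthogonalize
        beta = np.linalg.norm(v)
        v = v / beta
        R[i, i + 1] = beta
        # Next left vector (unnormalized): X v_next - beta * u
        u = X @ v - beta * u
        u -= U[:, :i + 1] @ (U[:, :i + 1].T @ u)  # reorthogonalize
    return U, V, R

def lanczos_regression(X, y, k):
    """Regression vector restricted to the k-dimensional Lanczos basis."""
    U, V, R = lanczos_bidiag(X, y, k)
    T = R.T @ R                    # small tridiagonal matrix R'R
    w, Q = np.linalg.eigh(T)       # diagonalize T = Q diag(w) Q'
    rhs = V.T @ (X.T @ y)          # X'y projected onto the basis
    z = Q @ ((Q.T @ rhs) / w)      # apply the inverse of T
    return V @ z

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((30, 5))
    y = rng.standard_normal(30)
    print(lanczos_regression(X, y, 2))  # two-component model
```

With k equal to the number of columns of a well-conditioned X, the result should agree with ordinary least squares up to round-off; smaller k gives a regularized model. All of the expensive work is in products with X and X′, while the eigendecomposition acts only on the k-by-k matrix R′R.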

