Abstract

Partial least squares (PLS) performs well for high-dimensional regression problems, where the number of predictors can far exceed the number of observations. Like many other supervised learning techniques, PLS was developed in the framework of empirical risk minimization, which typically assumes that the test and training data are drawn from the same distribution. Any violation of this assumption can degrade PLS performance. Subsampling via an influence function is a recently developed and promising technique for addressing this problem. However, influence functions are only guaranteed to be accurate for sufficiently small changes to the model, limiting their application to small-scale datasets. To overcome this obstacle, a new form of the influence function for PLS is derived, and a framework for subsampling via an influence function for PLS is developed. Results on four simulated datasets and two real-world datasets illustrate the effectiveness of our method compared with classic PLS and two other subsampling frameworks.
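The paper's closed-form influence function for PLS is not reproduced in this abstract, so the following is only a minimal sketch of the general idea of influence-based subsampling, assuming scikit-learn's PLSRegression. It approximates each training point's influence by brute-force leave-one-out refitting (the expensive computation a derived influence function is meant to replace), then drops the most harmful points before refitting. The function names `loo_influence_scores` and `subsample_by_influence` are hypothetical, not from the paper.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression


def loo_influence_scores(X_train, y_train, X_val, y_val, n_components=2):
    """Approximate each training point's influence as the change in
    validation MSE when that point is removed (positive = harmful).

    This is a brute-force stand-in for the paper's derived influence
    function and is only practical for small training sets.
    """
    base = PLSRegression(n_components=n_components).fit(X_train, y_train)
    base_mse = np.mean((y_val - base.predict(X_val).ravel()) ** 2)
    scores = np.empty(len(X_train))
    for i in range(len(X_train)):
        mask = np.arange(len(X_train)) != i
        model = PLSRegression(n_components=n_components).fit(
            X_train[mask], y_train[mask]
        )
        mse = np.mean((y_val - model.predict(X_val).ravel()) ** 2)
        # If removing point i lowers validation error, the point was harmful.
        scores[i] = base_mse - mse
    return scores


def subsample_by_influence(X_train, y_train, scores, drop_frac=0.1):
    """Keep the least harmful points, dropping the top drop_frac by score."""
    n_keep = int(len(scores) * (1 - drop_frac))
    keep = np.argsort(scores)[:n_keep]  # lowest scores = least harmful
    return X_train[keep], y_train[keep]


# Hypothetical usage on synthetic high-dimensional data (n=60, p=100):
rng = np.random.default_rng(0)
X, y = rng.normal(size=(60, 100)), rng.normal(size=60)
X_tr, y_tr, X_va, y_va = X[:40], y[:40], X[40:], y[40:]
scores = loo_influence_scores(X_tr, y_tr, X_va, y_va)
X_sub, y_sub = subsample_by_influence(X_tr, y_tr, scores, drop_frac=0.1)
```

A closed-form influence function, as derived in the paper, would replace the leave-one-out refitting loop with a first-order approximation of each point's effect, avoiding the O(n) model refits that make this sketch impractical beyond small n.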
