Abstract

Principal Components Regression (PCR) is a traditional tool for dimension reduction in linear regression that has been both criticized and defended. One concern about PCR is that computing the leading principal components tends to be computationally demanding for large data sets. While random projections do not possess the optimality properties of the leading principal subspace, they are computationally appealing and have hence become increasingly popular in recent years. In this paper, we present an analysis showing that for random projections satisfying a Johnson-Lindenstrauss embedding property, the prediction error in subsequent regression is close to that of PCR, at the expense of requiring a slightly larger number of random projections than principal components. Column sub-sampling constitutes an even cheaper approach to randomized dimension reduction, outside the class of Johnson-Lindenstrauss transforms. We provide numerical results based on synthetic and real data as well as basic theory revealing differences and commonalities in terms of statistical performance.
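
As a rough illustration of the three reduction schemes compared in this paper, the following Python sketch (synthetic data; not the authors' code, and all names are illustrative) reduces a design matrix to r columns via the top principal components, via a Gaussian Johnson-Lindenstrauss projection, and via uniform column sub-sampling, and then fits least squares on each reduced design:

import numpy as np

rng = np.random.default_rng(0)
n, d, r = 500, 200, 20           # sample size, ambient dimension, reduced dimension

# synthetic data, purely for illustration
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

def reduced_ls_mse(Z, y):
    # least squares on the reduced design Z; in-sample mean squared error
    w, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return np.mean((y - Z @ w) ** 2)

# (1) PCR: project X onto its r leading right singular vectors
_, _, Vt = np.linalg.svd(X, full_matrices=False)
Z_pcr = X @ Vt[:r].T

# (2) Johnson-Lindenstrauss embedding: Gaussian random projection R ∈ Rd×r
R = rng.standard_normal((d, r)) / np.sqrt(r)
Z_rp = X @ R

# (3) column sub-sampling: keep r columns of X drawn uniformly at random
Z_sub = X[:, rng.choice(d, size=r, replace=False)]

for name, Z in [("PCR", Z_pcr), ("random projection", Z_rp), ("subsampling", Z_sub)]:
    print(f"{name}: in-sample MSE = {reduced_ls_mse(Z, y):.3f}")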

Highlights

  • Principal Components Regression (PCR), first introduced in [24, 32], is perhaps the most basic approach to dimension reduction in linear regression

  • The use of PCR is debated in the literature, since principal components corresponding to small singular values may nevertheless contribute significantly to predicting the response variable [6, 29]

  • There are many more papers (e.g., [2, 23, 39, 40, 47, 49]) on the scenario in which X is reduced to RX, i.e., R multiplies X from the left and compresses the observations, rather than from the right as in XR, which compresses the predictors (see the shape check below)
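
A hypothetical shape check in Python making the two reduction directions concrete (all dimensions here are arbitrary, chosen only for illustration):

import numpy as np

rng = np.random.default_rng(0)
n, d, r, k = 500, 200, 20, 100   # k: reduced number of observations (hypothetical)
X = rng.standard_normal((n, d))

# right multiplication, as studied in this paper: fewer predictors
R_right = rng.standard_normal((d, r)) / np.sqrt(r)
print((X @ R_right).shape)       # (500, 20)

# left multiplication, as in [2, 23, 39, 40, 47, 49]: fewer observations
R_left = rng.standard_normal((k, n)) / np.sqrt(k)
print((R_left @ X).shape)        # (100, 200)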

Introduction

Principal Components Regression (PCR), first introduced in [24, 32], is perhaps the most basic approach to dimension reduction in linear regression. In PCR, the design matrix X ∈ Rn×d containing the original predictor variables is replaced by XR ∈ Rn×r, r < min(d, n), where R ∈ Rd×r projects X onto its top r principal components. We derive bounds on the prediction error of PCR and relate them to existing results on compressed least squares (CLS). The optimal linear predictor Xw∗ of y given X with respect to squared loss is defined by the optimization problem min_{w∈Rd} E[‖y − Xw‖₂²].
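
A minimal sketch of this construction, assuming R is taken as the matrix of the r leading right singular vectors of X, which is the standard PCR choice (function and variable names are illustrative):

import numpy as np

def pcr_predict(X, y, r):
    # R = V_r: the r leading right singular vectors of X, so that the columns
    # of Z = XR are the top-r principal component scores U_r diag(s_1, ..., s_r)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Z = X @ Vt[:r].T
    # least squares on the reduced design yields the PCR predictor of y
    w_r, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return Z @ w_r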
