Abstract
We prove that boosting with the squared error loss, L2Boosting, is consistent for very high-dimensional linear models, where the number of predictor variables is allowed to grow essentially as fast as O(exp(sample size)), assuming that the true underlying regression function is sparse in terms of the ℓ1-norm of the regression coefficients. In the language of signal processing, this means consistency for de-noising using a strongly overcomplete dictionary if the underlying signal is sparse in terms of the ℓ1-norm. We also propose an AIC-based method for tuning, namely for choosing the number of boosting iterations. This makes L2Boosting computationally attractive, since it avoids the multiple runs of the algorithm required by cross-validation, the tuning method most commonly used so far. We demonstrate L2Boosting on simulated data, in particular where the predictor dimension is large compared to the sample size, and on a difficult tumor-classification problem with gene expression microarray data.
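To make the procedure concrete, here is a minimal sketch of componentwise L2Boosting with a corrected-AIC stopping rule, in the spirit of the AIC-based tuning described above. The function name l2boost_aic, the step size nu, and the iteration cap max_iter are illustrative choices, not from the paper; the degrees of freedom are taken as the trace of the boosting hat matrix, an assumption consistent with the L2Boosting literature but not spelled out in this abstract.

```python
# Minimal sketch of componentwise L2Boosting with AICc-based stopping.
# Names (l2boost_aic, nu, max_iter) are hypothetical, not from the paper.
import numpy as np

def l2boost_aic(X, y, nu=0.1, max_iter=500):
    """Componentwise L2Boosting on an (n, p) design matrix X and
    response y; returns (intercept, coefficients) at the iteration
    minimizing a corrected AIC score."""
    n, p = X.shape
    col_ss = (X ** 2).sum(axis=0)           # squared column norms
    F = np.full(n, y.mean())                # start from the mean fit
    beta = np.zeros(p)
    B = np.full((n, n), 1.0 / n)            # hat matrix of the mean fit
    best_score, best_beta = np.inf, beta.copy()
    for _ in range(max_iter):
        U = y - F                           # current residuals
        gamma = X.T @ U / col_ss            # per-predictor least-squares fit
        j = np.argmax(gamma ** 2 * col_ss)  # largest drop in residual SS
        F += nu * gamma[j] * X[:, j]
        beta[j] += nu * gamma[j]
        # hat-matrix update: B <- B + nu * H_j (I - B),
        # with H_j = x_j x_j^T / ||x_j||^2
        Hj = np.outer(X[:, j], X[:, j]) / col_ss[j]
        B += nu * Hj @ (np.eye(n) - B)
        df = np.trace(B)                    # boosting degrees of freedom
        sigma2 = np.mean((y - F) ** 2)
        # corrected AIC; valid while trace(B) + 2 < n
        aicc = np.log(sigma2) + (1 + df / n) / (1 - (df + 2) / n)
        if aicc < best_score:
            best_score, best_beta = aicc, beta.copy()
    return y.mean(), best_beta

# Usage on synthetic data with p >> n, as in the paper's setting:
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 1000))
y = 2 * X[:, 0] - 1.5 * X[:, 3] + rng.standard_normal(100)
intercept, coef = l2boost_aic(X, y)
```

Because the AICc score is evaluated along a single boosting path, the number of iterations is chosen in one run of the algorithm, which is the computational advantage over cross-validation noted above.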