Oracle inequalities provide high-probability loss bounds for the lasso estimator at a deterministic choice of the regularization parameter, and they are commonly cited as theoretical justification for the lasso's ability to handle high-dimensional settings. In practice, however, the regularization parameter is not deterministic but is chosen by a random, data-dependent procedure, which often makes the implications of these inequalities misleading. We discuss general results and demonstrate empirically, for data with categorical predictors, that the deterioration in the lasso's performance as the number of unnecessary predictors increases can be far worse than the oracle inequalities suggest; imposing structure on the form of the estimates, however, can substantially reduce this deterioration.
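The setup the abstract describes can be illustrated with a minimal simulation sketch (not the paper's experiment; all sizes, effect values, and names such as `simulate` and `n_noise` are illustrative assumptions): a lasso fit with a cross-validation-chosen penalty on one relevant categorical predictor plus a growing number of irrelevant categorical predictors.

```python
# Minimal sketch, assuming scikit-learn >= 1.2: lasso with CV-selected lambda
# as the number of unnecessary categorical predictors grows.
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
n, n_levels = 200, 4  # observations; levels per categorical predictor (assumed)

def simulate(n_noise):
    """One relevant categorical predictor plus n_noise irrelevant ones."""
    signal = rng.integers(0, n_levels, size=(n, 1))
    noise = rng.integers(0, n_levels, size=(n, n_noise))
    X = OneHotEncoder(sparse_output=False).fit_transform(
        np.hstack([signal, noise]))
    beta = np.zeros(X.shape[1])
    beta[:n_levels] = [0.0, 1.0, -1.0, 2.0]  # effects of the relevant predictor
    y = X @ beta + rng.normal(size=n)
    return X, y, beta

for n_noise in (10, 100, 500):
    X, y, beta = simulate(n_noise)
    fit = LassoCV(cv=5).fit(X, y)            # data-dependent lambda via 5-fold CV
    mse = np.mean((X @ beta - fit.predict(X)) ** 2)  # error against the true mean
    print(f"noise predictors: {n_noise:4d}  estimation error: {mse:.3f}")
```

Watching the printed error grow with `n_noise` mirrors, under these assumed settings, the kind of deterioration the abstract contrasts with the rates implied by oracle inequalities at a deterministic penalty.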