Abstract

Many learning algorithms are formulated in terms of finding model parameters that minimize a data-fitting loss function plus a regularizer. When the regularizer involves the ℓ0 pseudo-norm, the resulting regularization path consists of a finite set of models. The fastest existing algorithm for computing the breakpoints in the regularization path is quadratic in the number of models, so it scales poorly to high-dimensional problems. We provide new formal proofs that a dynamic programming algorithm can be used to compute the breakpoints in linear time. Our empirical results include an analysis of the proposed algorithm in the context of various learning problems (regression, changepoint detection, clustering, and matrix factorization). We use a detailed analysis of changepoint detection problems to demonstrate the improved accuracy and speed of our approach relative to grid search and a previous quadratic-time algorithm.
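To make the idea of breakpoints concrete, the following is a minimal sketch, not necessarily the authors' exact procedure. It assumes each candidate model has an integer size k = 1, ..., N and a loss L_k, and that the model selected at penalty λ minimizes L_k + λk. Under that assumption the selected cost is the lower envelope of N lines, which a standard stack-based scan can compute in amortized linear time. The function name `path_breakpoints` and the input format are hypothetical.

```python
import math


def path_breakpoints(losses):
    """Breakpoints of the path k*(penalty) = argmin_k losses[k-1] + penalty * k.

    Assumption (hypothetical input format): losses[i] is the loss of the
    model with i + 1 parameters. Returns (sizes, penalties), where sizes[i]
    is selected for penalties between penalties[i + 1] and penalties[i].
    """
    sizes = []      # selected model sizes, in increasing order
    penalties = []  # penalties[i]: largest penalty at which sizes[i] is selected
    for size, loss in enumerate(losses, start=1):
        # A larger model that does not lower the loss is never selected.
        if sizes and loss >= losses[sizes[-1] - 1]:
            continue
        cross = math.inf  # the first model is selected for all large penalties
        while sizes:
            top = sizes[-1]
            # Penalty at which the new model and the top model cost the same.
            cross = (losses[top - 1] - loss) / (size - top)
            if cross < penalties[-1]:
                break
            # The top model is never strictly best, so discard it.
            sizes.pop()
            penalties.pop()
        sizes.append(size)
        penalties.append(cross)
    return sizes, penalties


sizes, penalties = path_breakpoints([10, 4, 3, 0])
print(sizes, penalties)
```

Each model is pushed at most once and popped at most once, which is where the linear bound in this sketch comes from. In the example, the model of size 3 is never selected because sizes 2 and 4 are at least as good at every penalty.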
