Abstract

The sparse regression problem, also known as best subset selection problem, can be cast as follows: Given a set S of n points in ℝ^d, a point y∈ ℝ^d, and an integer 2 ≤ k ≤ d, find an affine combination of at most k points of S that is nearest to y. We describe a O(n^{k-1} log^{d-k+2} n)-time randomized (1+e)-approximation algorithm for this problem with d and e constant. This is the first algorithm for this problem running in time o(n^k). Its running time is similar to the query time of a data structure recently proposed by Har-Peled, Indyk, and Mahabadi (ICALP'18), while not requiring any preprocessing. Up to polylogarithmic factors, it matches a conditional lower bound relying on a conjecture about affine degeneracy testing. In the special case where k = d = O(1), we provide a simple O_δ(n^{d-1+δ})-time deterministic exact algorithm, for any δ > 0. Finally, we show how to adapt the approximation algorithm for the sparse linear regression and sparse convex regression problems with the same running time, up to polylogarithmic factors.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.