Abstract

A function |$f:{\mathbb{R}}^d \rightarrow {\mathbb{R}}$| is a sparse additive model (SPAM) if it is of the form |$f(\mathbf x) = \sum_{l \in \mathscr{S}}\phi_{l}(x_l)$|, where |$\mathscr{S} \subset [{d}]$| and |${|{{\mathscr{S}}}|} \ll {d}$|. There exists extensive work on estimating |$f$| from its samples when the |$\phi_l$|'s and |$\mathscr{S}$| are unknown. In this work, we consider a generalized version of SPAMs that also allows for a small number of second-order interaction terms. For some |${\mathscr{S}_1} \subset [{d}], {\mathscr{S}_2} \subset {[d] \choose 2}$|, with |${|{{{\mathscr{S}_1}}}|} \ll {d}, {|{{{\mathscr{S}_2}}}|} \ll d^2$|, the function |$f$| is now assumed to be of the form |$f(\mathbf x) = \sum_{p \in {\mathscr{S}_1}}\phi_{p} (x_p) + \sum_{(l,l^{\prime}) \in {\mathscr{S}_2}}\phi_{(l,l^{\prime})} (x_{l},x_{l^{\prime}})$|. Assuming we have the freedom to query |$f$| anywhere in its domain, we derive efficient algorithms that provably recover |${\mathscr{S}_1},{\mathscr{S}_2}$| with finite sample bounds. Our analysis covers the noiseless setting, where exact samples of |$f$| are obtained, and extends to the noisy setting, where the queries are corrupted with noise. For the noisy setting, we consider two noise models: i.i.d. Gaussian noise and arbitrary but bounded noise. Our main methods for identifying |${\mathscr{S}_2}$| rely on the estimation of sparse Hessian matrices, for which we provide two novel compressed sensing-based schemes. Once |${\mathscr{S}_1}, {\mathscr{S}_2}$| are known, we show how the individual components |$\phi_p$|, |$\phi_{(l,l^{\prime})}$| can be estimated via additional queries of |$f$|, with uniform error bounds. Lastly, we provide simulation results on synthetic data that validate our theoretical findings.
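As an illustration of why the Hessian carries the information needed to identify |${\mathscr{S}_2}$|, the following minimal Python sketch builds a toy function of the generalized SPAM form and flags a pair |$(l,l^{\prime})$| whenever a finite-difference estimate of the mixed second derivative is nonzero at some random query point. This is not the compressed sensing-based scheme of the paper: it is a brute-force check over all |${d \choose 2}$| pairs in the noiseless setting, substituted here purely for illustration. The example function, the names `mixed_difference` and `find_interactions`, and the step size, tolerance and number of query points are all assumptions made for this sketch.

```python
import itertools
import numpy as np


def f(x):
    # Hypothetical toy model with d = 6 (0-based indices):
    # S1 = {0}, S2 = {(1, 2), (3, 4)}; coordinate 5 is inactive.
    return np.sin(x[0]) + x[1] * x[2] + np.exp(x[3] * x[4])


def mixed_difference(func, x, i, j, h=1e-2):
    # Finite-difference estimate of the mixed partial d^2 f / (dx_i dx_j).
    # For a function with no interaction between x_i and x_j, the four
    # terms cancel exactly (up to floating-point rounding).
    e_i = np.zeros_like(x)
    e_i[i] = h
    e_j = np.zeros_like(x)
    e_j[j] = h
    return (func(x + e_i + e_j) - func(x + e_i) - func(x + e_j) + func(x)) / h**2


def find_interactions(func, d, n_points=5, tol=1e-6, seed=0):
    # Query func at a few random points and keep every pair (i, j) whose
    # estimated mixed derivative exceeds tol at one or more of them.
    rng = np.random.default_rng(seed)
    points = rng.uniform(-1.0, 1.0, size=(n_points, d))
    found = set()
    for i, j in itertools.combinations(range(d), 2):
        if any(abs(mixed_difference(func, x, i, j)) > tol for x in points):
            found.add((i, j))
    return found


if __name__ == "__main__":
    print(sorted(find_interactions(f, d=6)))
```

This naive check uses four queries per pair and per point, so its query count grows quadratically in |$d$|; exploiting the sparsity of the Hessian, as the abstract describes, is what avoids this cost.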

Highlights

  • Many scientific problems involve estimating an unknown function |$f$|, defined over a compact subset of |${\mathbb{R}}^d$|, with |$d$| large

  • One structural assumption on |$f$| leads to the class of sparse additive models (SPAMs), wherein |$f = \sum_{l \in \mathscr{S}} \phi_l$| for some unknown |$\mathscr{S} \subset \{1, \dots, d\}$| with |$|\mathscr{S}| = k \ll d$|

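  • We focus on a generalized SPAM model, where |$f$| can contain a small number of second-order interaction terms, i.e., |$f(x_1, x_2, \dots, x_d) = \sum_{p \in {\mathscr{S}_1}}\phi_{p}(x_p) + \sum_{(l,l^{\prime}) \in {\mathscr{S}_2}}\phi_{(l,l^{\prime})}(x_{l},x_{l^{\prime}})$|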


Summary

Introduction

Many scientific problems involve estimating an unknown function |$f$|, defined over a compact subset of |${\mathbb{R}}^d$|, with |$d$| large. There exist algorithms for estimating such |$f$|, tailored to the underlying structural assumption, that come with attractive theoretical guarantees and do not suffer from the curse of dimensionality (cf. [16, 10, 51, 19]). One such assumption leads to the class of sparse additive models (SPAMs), wherein |$f = \sum_{l \in \mathscr{S}} \phi_l$| for some unknown |$\mathscr{S} \subset \{1, \dots, d\}$| with |$|\mathscr{S}| = k \ll d$|.

