Abstract

For multivariate nonparametric regression, functional analysis of variance (ANOVA) modeling aims to capture the relationship between a response and covariates by decomposing the unknown function into various components, representing main effects, two-way interactions, etc. Such an approach has been pursued explicitly in smoothing spline ANOVA modeling and implicitly in various greedy methods such as MARS. We develop a new method for functional ANOVA modeling, based on doubly penalized estimation using total-variation and empirical-norm penalties, to achieve sparse selection of component functions and their basis functions. For this purpose, we formulate a new class of hierarchical total variations, which measures total variations at different levels including main effects and multi-way interactions, possibly after some order of differentiation. Furthermore, we derive suitable basis functions for multivariate splines such that the hierarchical total variation can be represented as a regular Lasso penalty, and hence we extend a previous backfitting algorithm to handle doubly penalized estimation for ANOVA modeling. We present extensive numerical experiments on simulations and real data to compare our method with existing methods including MARS, tree boosting, and random forest. The results are very encouraging and demonstrate notable gains from our method in prediction or classification accuracy and simplicity of the fitted functions. Supplementary materials for this article are available online.
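To illustrate how a total variation can be written as a Lasso penalty, the following is a minimal univariate sketch. It is not the multivariate hierarchical construction developed in the article. It assumes a piecewise-constant function expanded in step basis functions 1{x > t_k}, for which the total variation equals the L1 norm of the coefficients. The function names (`step_basis`, `total_variation`, `empirical_norm`, `doubly_penalized_objective`) and the tuning parameters `rho` and `lam` are hypothetical, chosen only for this example.

```python
import numpy as np

def step_basis(x, knots):
    """Design matrix whose k-th column is the step function 1{x > t_k}."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    return (x[:, None] > knots[None, :]).astype(float)

def total_variation(values):
    """Total variation of a function given its values on a sorted grid."""
    return float(np.sum(np.abs(np.diff(values))))

def empirical_norm(values):
    """Empirical L2 norm: square root of the mean of squared values."""
    return float(np.sqrt(np.mean(np.square(values))))

def doubly_penalized_objective(y, basis, beta, rho, lam):
    """Squared-error loss plus a TV (Lasso) penalty and an empirical-norm penalty.

    Assumed form for illustration only: rho weights the L1 norm of beta
    (the total variation under the step basis) and lam weights the
    empirical norm of the fitted component.
    """
    fitted = basis @ beta
    loss = 0.5 * np.mean((y - fitted) ** 2)
    return float(loss + rho * np.sum(np.abs(beta)) + lam * empirical_norm(fitted))

# Hypothetical knots and coefficients for one component function.
knots = np.array([0.2, 0.5, 0.8])
beta = np.array([1.5, -2.0, 0.5])

# Evaluate the component on a fine sorted grid and compare TV with the L1 norm.
grid = np.linspace(0.0, 1.0, 1001)
basis = step_basis(grid, knots)
f_grid = basis @ beta
tv = total_variation(f_grid)
l1 = float(np.sum(np.abs(beta)))
```

Here `tv` and `l1` agree because each step basis function contributes one jump of size |beta_k|. Under this basis, penalizing total variation is therefore the same as applying a Lasso penalty to the coefficients.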
