Abstract

A general methodology is proposed for explaining the variability in a quantity of interest $x$ in terms of covariates $z = (z_1, \ldots, z_L)$. It provides the conditional mean $\bar{x}(z)$ as a sum of components, where each component is a product of non-parametric one-dimensional functions, one for each covariate $z_l$, computed through an alternating projection procedure. Both $x$ and the $z_l$ can be real or categorical variables. In addition, some or all values of each $z_l$ can be unknown, which provides a general framework for multi-clustering, classification and covariate imputation in the presence of confounding factors. The procedure can be regarded as a preconditioning step for the more general determination of the full conditional distribution $\boldsymbol{\rho}(x|z)$ through a data-driven optimal-transport barycenter problem. In particular, iterating the procedure just once yields the second-order structure (i.e., the covariance) of $\boldsymbol{\rho}(x|z)$. The methodology is illustrated through examples that include the explanation of the variability of ground temperature across the continental United States and the prediction of book preference among potential readers.
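To make the "sum of products of one-dimensional functions" structure concrete, the following is a minimal sketch and not the authors' implementation. It assumes only two covariates, both categorical and fully observed, so that each one-dimensional function reduces to a lookup table, and it replaces the alternating projection procedure with plain alternating least squares plus greedy fitting of one component at a time. The function names (`fit_rank_one`, `fit_components`, `predict`) and all parameters are hypothetical.

```python
import numpy as np

def fit_rank_one(x, z1, z2, n1, n2, n_iter=50):
    """Fit x_i ~ f[z1_i] * g[z2_i] by alternating least squares.

    Assumption: z1 and z2 are integer category labels in [0, n1) and [0, n2).
    """
    f = np.ones(n1)
    g = np.ones(n2)
    for _ in range(n_iter):
        # With g fixed, the best f[a] is a per-category least-squares ratio.
        num = np.bincount(z1, weights=x * g[z2], minlength=n1)
        den = np.bincount(z1, weights=g[z2] ** 2, minlength=n1)
        f = num / np.maximum(den, 1e-12)
        # With f fixed, update g in the same way.
        num = np.bincount(z2, weights=x * f[z1], minlength=n2)
        den = np.bincount(z2, weights=f[z1] ** 2, minlength=n2)
        g = num / np.maximum(den, 1e-12)
    return f, g

def fit_components(x, z1, z2, n1, n2, n_components=2, n_iter=50):
    """Greedily fit a sum of product components to the residual."""
    residual = np.asarray(x, dtype=float).copy()
    components = []
    for _ in range(n_components):
        f, g = fit_rank_one(residual, z1, z2, n1, n2, n_iter)
        components.append((f, g))
        # Remove what this component explains before fitting the next one.
        residual -= f[z1] * g[z2]
    return components

def predict(components, z1, z2):
    """Estimated conditional mean: sum over components of f[z1] * g[z2]."""
    return sum(f[z1] * g[z2] for f, g in components)
```

In this simplified setting, `predict(components, z1, z2)` plays the role of $\bar{x}(z)$. Real-valued covariates and missing covariate values, both of which the methodology covers, would need a smoother and an imputation step that this sketch does not attempt.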
