ABSTRACTWe develop a new method for multivariate scalar on multidimensional distribution regression. Traditional approaches typically analyze isolated univariate scalar outcomes or consider unidimensional distributional representations as predictors. However, these approaches are suboptimal because (i) they fail to utilize the dependence between the distributional predictors and (ii) neglect the correlation structure of the response. To overcome these limitations, we propose a multivariate distributional analysis framework that harnesses the power of multivariate density functions and multitask learning. We develop a computationally efficient semiparametric estimation method for modeling the effect of the latent joint density on the multivariate response of interest. Additionally, we introduce a new conformal prediction algorithm for quantifying the uncertainty of our multivariate predictions based on subject characteristics and individualized distributional predictors, providing valuable insights into the conditional distribution of the response. We validate the effectiveness of our proposed method through comprehensive numerical simulations, clearly demonstrating its superior performance compared to traditional methods. The application of the proposed method is demonstrated on triaxial accelerometer data from the National Health and Nutrition Examination Survey 2011–2014 for modeling the association between cognitive scores across various domains and distributional representation of physical activity among the older adult population. Our results highlight the advantages of the proposed approach, emphasizing the significance of incorporating multidimensional distributional information in the triaxial accelerometer data.
Read full abstract