Abstract

Tensor data, such as image sets, movie data, gene–environment interactions, or gene–gene interactions, have become a popular data format in many fields. Multilinear Principal Component Analysis (MPCA) has been recognized as an efficient dimension reduction method for tensor data analysis. However, a satisfactory rank selection method for general applications of MPCA is not yet available. For example, both the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), arguably two of the most commonly used model selection methods, require stricter model assumptions when applied to rank selection in MPCA. In this paper, we propose a rank selection rule for MPCA based on the minimum risk criterion and Stein's unbiased risk estimate (SURE). We derive a compact formula under minimal model assumptions for MPCA. It is composed of a residual sum of squares for model fitting and a penalty on the model complexity, referred to as the generalized degrees of freedom (GDF). We attribute each term in the GDF either to the number of parameters used in the model or to the complexity of separating the signal from the noise. In a thorough simulation study, this criterion achieves higher accuracy than AIC, BIC, and their modified versions. Importantly, it has potential for more general application because it makes fewer model assumptions.
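The criterion described above has the generic form "residual sum of squares plus a complexity penalty." As a minimal illustration of this idea (not the paper's exact GDF formula), the sketch below truncates each mode of a tensor HOSVD-style, computes the reconstruction RSS for each candidate multilinear rank, and adds a SURE-like penalty `2 * sigma2 * p`, where the parameter count `p` is a hypothetical stand-in for the GDF; the noise variance `sigma2` is assumed known.

```python
import numpy as np
from itertools import product

def unfold(T, mode):
    # Mode-k matricization: rows indexed by the k-th tensor dimension.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mpca_project(T, ranks):
    # HOSVD-style truncation: keep the leading r_k left singular
    # vectors of each mode-k unfolding, then project and reconstruct.
    Us = []
    for k, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, k), full_matrices=False)
        Us.append(U[:, :r])
    core = T
    for k, U in enumerate(Us):
        core = np.moveaxis(np.tensordot(U.T, np.moveaxis(core, k, 0), axes=1), 0, k)
    recon = core
    for k, U in enumerate(Us):
        recon = np.moveaxis(np.tensordot(U, np.moveaxis(recon, k, 0), axes=1), 0, k)
    return recon

def select_rank(T, candidates, sigma2):
    # Pick the multilinear rank minimizing RSS + 2 * sigma2 * p,
    # where p (factor entries + core entries) is a hypothetical
    # stand-in for the paper's generalized degrees of freedom.
    best = None
    for ranks in candidates:
        rss = np.sum((T - mpca_project(T, ranks)) ** 2)
        p = sum(d * r for d, r in zip(T.shape, ranks)) + int(np.prod(ranks))
        score = rss + 2 * sigma2 * p
        if best is None or score < best[0]:
            best = (score, ranks)
    return best[1]
```

On a synthetic tensor with true multilinear rank (2, 2, 2) and small Gaussian noise, this score tends to recover the true rank, since overfitting lowers the RSS by roughly `sigma2` per extra effective parameter while the penalty charges twice that amount.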

