The reduced-rank regression model is a popular model for handling multivariate responses with multiple predictors, and is widely used in biology, chemometrics, econometrics, engineering, and other fields. In reduced-rank regression modeling, a central objective is to estimate the rank of the coefficient matrix, which represents the number of effective latent factors in predicting the multivariate response. Although theoretical results such as rank estimation consistency have been established for various methods, in practice rank determination still relies on information-criterion-based methods such as AIC and BIC or subsampling-based methods such as cross-validation. Unfortunately, the theoretical properties of these practical methods are largely unknown. In this article, we present a novel method called StARS-RRR that selects the tuning parameter and then estimates the rank of the coefficient matrix for reduced-rank regression based on the stability approach. We prove that StARS-RRR achieves rank estimation consistency, that is, the rank estimated with the tuning parameter selected by StARS-RRR is consistent for the true rank. Through a simulation study, we show that StARS-RRR outperforms other tuning parameter selection methods including AIC, BIC, and cross-validation, as it provides the most accurate estimated rank. In addition, when applied to a breast cancer dataset, StARS-RRR discovers a reasonable number of genetic pathways that affect the DNA copy number variations and yields a smaller prediction error than the other methods under a random-splitting procedure. Supplementary materials for this article are available online.
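To make the notion of a rank-constrained coefficient matrix concrete, the following is a minimal sketch of classical reduced-rank regression with identity weighting, where the rank is supplied by the user. It is not the StARS-RRR procedure described above, and it does not select a tuning parameter. The function name `reduced_rank_fit`, the simulated data, and all dimensions are assumptions chosen for illustration.

```python
import numpy as np


def reduced_rank_fit(X, Y, rank):
    """Rank-constrained least squares (identity weighting), hypothetical helper.

    Fits ordinary least squares, then projects the coefficient matrix onto
    the top `rank` right singular vectors of the fitted values X @ B_ols.
    """
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    fitted = X @ B_ols
    # Rows of Vt are the right singular vectors of the fitted values.
    _, _, Vt = np.linalg.svd(fitted, full_matrices=False)
    V_r = Vt[:rank].T
    return B_ols @ V_r @ V_r.T


# Simulated data (assumed setup): 8 predictors, 6 responses, true rank 2.
rng = np.random.default_rng(0)
n, p, q, true_rank = 200, 8, 6, 2
X = rng.standard_normal((n, p))
B_true = rng.standard_normal((p, true_rank)) @ rng.standard_normal((true_rank, q))
Y = X @ B_true + 0.1 * rng.standard_normal((n, q))

B_hat = reduced_rank_fit(X, Y, rank=true_rank)
rel_error = np.linalg.norm(B_hat - B_true) / np.linalg.norm(B_true)
print("numerical rank of B_hat:", np.linalg.matrix_rank(B_hat, tol=1e-8))
print("relative estimation error:", rel_error)
```

In this sketch the rank is treated as known. In practice it has to be chosen from the data, which is the problem that AIC, BIC, cross-validation, and the stability approach are meant to address.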