Abstract

The envelope method was recently proposed to reduce the dimension of the responses in multivariate regression. However, when data are missing, applying the envelope method to the complete cases alone may lead to biased and inefficient results. In this paper, we generalize envelope estimation to settings where the predictors and/or the responses are missing at random. Specifically, we incorporate the envelope structure into the expectation-maximization (EM) algorithm. Because the parameters under the envelope method are not pointwise identifiable, the EM algorithm for the envelope method is not straightforward and requires a special decomposition. Our method is guaranteed to be at least as efficient as the standard EM algorithm. Moreover, our method has the potential to outperform the full data MLE. We give asymptotic properties of our method under both normal and non-normal errors. The efficiency gain over the standard EM is confirmed in simulation studies and in an application to the Chronic Renal Insufficiency Cohort (CRIC) study.

Highlights

  • A new dimension reduction method, the envelope method, has been proposed for multivariate regression [11]

  • Unlike standard dimension reduction methods, the envelope method assumes redundancy among the responses rather than among the predictors

  • Among the few biomarkers that remain significant, the standard EM and our method disagree on some: our method found high-sensitivity C-reactive protein (HS CRP), aldosterone, and C-peptide to be significant, which the standard EM did not; whereas the standard EM found neutrophil gelatinase-associated lipocalin (NGAL) to be significant, which our method did not

Summary

Introduction

A new dimension reduction method, the envelope method, has been proposed for multivariate regression [11]. Unlike standard dimension reduction methods, the envelope method assumes redundancy among the responses rather than among the predictors. We generalize the envelope method to data with missing predictors and responses. Our proposed approach to recovering the missing information can be extended to the predictor envelope model, where the redundancy is assumed among the predictors rather than the responses, as well as to the case where redundancy is present among both the responses and the predictors. To the best of our knowledge, our paper is among the first in the dimension reduction literature to address the case where both responses and predictors are subject to missingness.
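
To make the idea of redundancy among the responses concrete, the following is a sketch of the response envelope model as it is commonly written in the envelope literature. The notation ($r$ responses, $p$ predictors, envelope dimension $u$, and the symbols $\mu$, $\Gamma$, $\Gamma_0$, $\eta$, $\Omega$, $\Omega_0$) is assumed here for illustration and may differ from the notation used in the full text.

```latex
\begin{align*}
Y &= \mu + \Gamma \eta X + \varepsilon, \qquad \mathrm{E}(\varepsilon) = 0, \quad \mathrm{Var}(\varepsilon) = \Sigma, \\
\Sigma &= \Gamma \Omega \Gamma^{\top} + \Gamma_0 \Omega_0 \Gamma_0^{\top}.
\end{align*}
```

Here $Y \in \mathbb{R}^{r}$, $X \in \mathbb{R}^{p}$, $\Gamma \in \mathbb{R}^{r \times u}$ is semi-orthogonal with orthogonal complement $\Gamma_0 \in \mathbb{R}^{r \times (r-u)}$, $\eta \in \mathbb{R}^{u \times p}$, and the regression coefficient matrix is $\beta = \Gamma \eta$. In this form, $\Gamma_0^{\top} Y$ is the redundant part of the response: its distribution does not depend on $X$, and it is uncorrelated with $\Gamma^{\top} Y$ given $X$. Only the span of $\Gamma$ is identifiable, since for any $u \times u$ orthogonal matrix $O$, replacing $(\Gamma, \eta, \Omega)$ by $(\Gamma O, O^{\top} \eta, O^{\top} \Omega O)$ leaves $\beta$ and $\Sigma$ unchanged. This is one way to read the abstract's remark that the parameters are not pointwise identifiable.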

Preliminary
The observed data likelihood
The EM updates
Selection of the envelope dimension
Asymptotics
Normal errors
Non-normal errors
Data analysis
Findings
Discussion
Our Method
