Data-adaptive Shrinkage via the Hyperpenalized EM Algorithm.

Philip S Boonstra, Bhramar Mukherjee, Jeremy M G Taylor

doi:10.1007/s12561-015-9132-x

Open Access

Abstract

We propose an extension of the expectation-maximization (EM) algorithm, called the hyperpenalized EM (HEM) algorithm, that maximizes a penalized log-likelihood when some data are missing or unavailable, using a data-adaptive estimate of the penalty parameter. This is potentially useful in applications in which the analyst is unable or unwilling to choose a single value of the penalty parameter but can instead posit a plausible range of values. The HEM algorithm is conceptually straightforward and effective, and we demonstrate its utility in the analysis of a genomic data set in which gene expression measurements and clinical covariates are used to predict survival time. Many of the survival times are censored, and some observations contain only expression measurements derived from a different assay; together these constitute a difficult missing data problem. The goal is to shrink the genomic contribution in a data-adaptive way. The HEM algorithm successfully handles both the missing data and the shrinkage aspects of the problem.
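
To make the idea in the abstract concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes a linear model with unit error variance, right-censored outcomes, a ridge penalty (lam / 2) * ||beta||^2 on the coefficients, and a Gamma(a, b) "hyperpenalty" on lam that encodes a plausible range of penalty values. All names (`toy_hyperpenalized_em`, `inverse_mills`, `a`, `b`) and the specific update for `lam` are illustrative assumptions; the paper's actual model and updates may differ.

```python
import math
import numpy as np


def inverse_mills(a):
    """Return phi(a) / (1 - Phi(a)) elementwise for a standard normal."""
    pdf = np.exp(-0.5 * a ** 2) / math.sqrt(2.0 * math.pi)
    erfc = np.vectorize(math.erfc, otypes=[float])
    survival = 0.5 * erfc(a / math.sqrt(2.0))
    return pdf / survival


def toy_hyperpenalized_em(X, t, censored, a=2.0, b=1.0, n_iter=200):
    """Toy penalized EM with a data-adaptive ridge penalty (assumed setup).

    X        : (n, p) design matrix
    t        : (n,) observed values; for censored rows, the censoring time
    censored : (n,) boolean mask, True where the outcome exceeds t
    a, b     : hypothetical Gamma hyperpenalty parameters for lam
    """
    n, p = X.shape
    beta = np.zeros(p)
    lam = a / b  # start at the mean of the assumed Gamma(a, b)
    for _ in range(n_iter):
        # E-step: replace each censored outcome by its conditional mean
        # E[y | y > t] under the current fit (error variance fixed at 1).
        mu = X @ beta
        y = np.asarray(t, dtype=float).copy()
        y[censored] = mu[censored] + inverse_mills(t[censored] - mu[censored])

        # M-step for beta: ridge regression on the completed outcomes.
        beta = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

        # Conditional maximization for lam: maximize
        # (p/2 + a - 1) * log(lam) - lam * (||beta||^2 / 2 + b).
        lam = (0.5 * p + a - 1.0) / (0.5 * beta @ beta + b)
    return beta, lam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_beta = np.array([1.0, -1.0, 0.5, 0.0, 0.0])
    X = rng.normal(size=(500, 5))
    y_full = X @ true_beta + rng.normal(size=500)
    cutoff = 1.0  # outcomes above this value are right-censored
    censored = y_full > cutoff
    t = np.minimum(y_full, cutoff)
    beta_hat, lam_hat = toy_hyperpenalized_em(X, t, censored)
    print(beta_hat, lam_hat)
```

In this sketch the penalty parameter is never fixed by the analyst: it is re-estimated at every iteration from the current coefficients, with `a` and `b` controlling how strongly it is pulled toward the assumed range.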

Full Text