Abstract

We propose an extension of the expectation-maximization (EM) algorithm, called the hyperpenalized EM (HEM) algorithm, that maximizes a penalized log-likelihood, for which some data are missing or unavailable, using a data-adaptive estimate of the penalty parameter. This is potentially useful in applications for which the analyst is unable or unwilling to choose a single value of a penalty parameter but instead can posit a plausible range of values. The HEM algorithm is conceptually straightforward and also very effective, and we demonstrate its utility in the analysis of a genomic data set. Gene expression measurements and clinical covariates were used to predict survival time. However, many survival times are censored, and some observations only contain expression measurements derived from a different assay, which together constitute a difficult missing data problem. It is desired to shrink the genomic contribution in a data-adaptive way. The HEM algorithm successfully handles both the missing data and shrinkage aspects of the problem.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call