Abstract
Estimators derived from the expectation‐maximization (EM) algorithm are not robust since they are based on the maximization of the likelihood function. We propose an iterative proximal‐point algorithm based on the EM algorithm to minimize a divergence criterion between a mixture model and the unknown distribution that generates the data. In each iteration the algorithm estimates the proportions and the parameters of the mixture components in two separate steps. The resulting estimators are generally robust against outliers and model misspecification. Convergence properties of our algorithm are studied. Convergence of the proposed algorithm is discussed for a two‐component Weibull mixture, yielding a condition on the initialization of the EM algorithm needed for the latter to converge. Simulations on Gaussian and Weibull mixture models using different statistical divergences are provided to confirm the validity of our work and the robustness of the resulting estimators against outliers in comparison with the EM algorithm. An application to a dataset of velocities of galaxies is also presented. The Canadian Journal of Statistics 47: 392–408; 2019 © 2019 Statistical Society of Canada
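For orientation, a proximal-point iteration of the kind described above can be written schematically as follows; the symbols are generic placeholders (an estimated divergence criterion D̂, a nonnegative proximal term Ψ and a relaxation sequence β_k) and are not taken verbatim from the article:

\[
\phi^{(k+1)} \in \arg\min_{\phi} \Big\{ \hat{D}\big(p_{\phi}\big) + \beta_k \, \Psi\big(\phi, \phi^{(k)}\big) \Big\},
\qquad \Psi(\phi, \phi') \ge 0, \quad \Psi(\phi, \phi) = 0.
\]

A natural reading of the two-step scheme mentioned above is that, writing φ as (proportions, component parameters), each iteration first updates the proportions with the component parameters held fixed and then the component parameters with the new proportions held fixed.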
Highlights
The expectation-maximization (EM) algorithm (Dempster, Laird & Rubin, 1977) is a well-known method for calculating the maximum likelihood estimator (MLE) of a model where incomplete data are considered.
We propose to calculate the two MDφDEs and the minimum density power divergence (MDPD) estimator, when pφ is a mixture model, using an iterative procedure based on the work of Tseng (2004) on the log-likelihood function.
We measure the error of replacing the true distribution of the data with the model using the total variation distance (TVD), which by the Scheffé lemma can be computed from the L1 distance (e.g., Meister, 2009, p. 129); a short numerical sketch of this computation follows these highlights.
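As a concrete illustration of the two preceding highlights, the sketch below numerically evaluates the density power divergence (DPD) criterion for a two-component Gaussian mixture on a sample and the total variation distance as half the L1 distance between two mixture densities. The mixture parameters, the sample and the tuning constant alpha are illustrative choices, not quantities taken from the article.

```python
# Illustrative sketch (not the article's code): evaluates
#  (i) the density power divergence (DPD) objective for a two-component
#      Gaussian mixture on a sample, and
#  (ii) the total variation distance (TVD) between two mixture densities
#      as half the L1 distance (Scheffé lemma).
# All numerical values below are made up for illustration only.
import numpy as np
from scipy.integrate import trapezoid
from scipy.stats import norm

GRID = np.linspace(-15.0, 15.0, 4001)  # integration grid for the densities

def mixture_pdf(x, w, mu1, s1, mu2, s2):
    """Density of the two-component mixture w*N(mu1, s1) + (1-w)*N(mu2, s2)."""
    return w * norm.pdf(x, mu1, s1) + (1.0 - w) * norm.pdf(x, mu2, s2)

def dpd_objective(sample, params, alpha=0.5):
    """Empirical DPD criterion: integral of f^(1+alpha) - (1 + 1/alpha) * mean_i f(X_i)^alpha."""
    f_grid = mixture_pdf(GRID, *params)
    integral_term = trapezoid(f_grid ** (1.0 + alpha), GRID)
    data_term = (1.0 + 1.0 / alpha) * np.mean(mixture_pdf(sample, *params) ** alpha)
    return integral_term - data_term

def total_variation(params_a, params_b):
    """TVD(f_a, f_b) = 0.5 * integral of |f_a - f_b|  (Scheffé lemma)."""
    fa = mixture_pdf(GRID, *params_a)
    fb = mixture_pdf(GRID, *params_b)
    return 0.5 * trapezoid(np.abs(fa - fb), GRID)

# Clean two-component sample contaminated by a small cluster of outliers.
rng = np.random.default_rng(0)
n = 500
labels = rng.random(n) < 0.4
clean = np.where(labels, rng.normal(-2.0, 1.0, n), rng.normal(3.0, 1.5, n))
sample = np.concatenate([clean, rng.normal(10.0, 0.5, 25)])  # roughly 5% outliers

true_params = (0.4, -2.0, 1.0, 3.0, 1.5)
shifted_params = (0.4, -2.0, 1.0, 5.0, 1.5)
print("DPD objective at the true parameters:", dpd_objective(sample, true_params))
print("TVD(true, shifted):", total_variation(true_params, shifted_params))
```

Minimizing dpd_objective over params with a generic optimizer (for example scipy.optimize.minimize) would give an MDPD-type fit; the sketch only evaluates the criterion.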
Summary
The expectation-maximization (EM) algorithm (Dempster, Laird & Rubin, 1977) is a well-known method for calculating the maximum likelihood estimator (MLE) of a model where incomplete data are considered. Several variants of the EM algorithm are available; see McLachlan & Krishnan (2007). Another way to look at the EM algorithm is as a proximal-point problem; see Chrétien & Hero (1998) and Tseng (2004). We may rewrite the conditional expectation of the complete log-likelihood as the log-likelihood function of the model (the objective) plus a proximal term. The proximal term has a regularization effect on the objective function, so that the algorithm becomes more stable, can avoid some saddle points, and frequently outperforms classical optimization algorithms; see Goldstein & Russak (1987) and Chrétien & Hero (2008). Notice that EM-type algorithms usually enjoy no more than linear convergence.
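The decomposition alluded to above is the following standard identity. Writing ℓ(θ) for the observed-data log-likelihood, Q(θ | θ^k) for the conditional expectation of the complete log-likelihood given the observed data x and the current iterate θ^k, and p(z | x; θ) for the conditional density of the missing data,

\[
\arg\max_{\theta} \; Q(\theta \mid \theta^{k})
= \arg\max_{\theta} \Big\{ \ell(\theta) \;-\; \mathrm{KL}\big( p(\cdot \mid x; \theta^{k}) \,\big\|\, p(\cdot \mid x; \theta) \big) \Big\},
\]

so each M-step is a proximal-point step on ℓ with a Kullback–Leibler proximal term that vanishes at θ = θ^k; this is the interpretation developed by Chrétien & Hero (1998) and Tseng (2004).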