Abstract

Expectation maximization (EM) is a technique for estimating maximum-likelihood parameters of a latent variable model given observed data, by alternating between computing expectations of sufficient statistics and maximizing the expected log likelihood. When the sufficient statistics are intractable, stochastic approximation EM (SAEM) is often used, which approximates the expected log likelihood with Monte Carlo techniques. Two common implementations of SAEM, Batch EM (BEM) and online EM (OEM), are parameterized by a “learning rate”, and their efficiency depends strongly on this parameter. We propose an extension to the OEM algorithm, termed Introspective Online Expectation Maximization (IOEM), which removes the need for specifying this parameter by adapting the learning rate to trends in the parameter updates. We show that our algorithm matches the efficiency of the optimal BEM and OEM algorithms in multiple models, and that the efficiency of IOEM can exceed that of BEM/OEM methods with optimal learning rates when the model has many parameters. Finally, we use IOEM to fit two models to a financial time series. A Python implementation is available at https://github.com/luntergroup/IOEM.git.
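To make the role of the learning rate concrete, the following minimal sketch implements an online (stochastic approximation) EM update for a toy model: a two-component Gaussian mixture with known weights and variance, where only the component means are estimated. The toy model, the function name online_em_gmm_means, the burn-in, and the polynomial schedule gamma_t = t**(-alpha) are illustrative assumptions, not the autoregressive or stochastic volatility models analysed in the paper; the point is that the exponent alpha must be chosen by the user, which is exactly the tuning burden IOEM is designed to remove.

    import numpy as np

    def online_em_gmm_means(y, alpha=0.6, sigma=1.0, weights=(0.5, 0.5),
                            mu_init=(-0.5, 0.5), burn_in=10):
        """Online (stochastic approximation) EM for the means of a two-component
        Gaussian mixture with known weights and variance. The learning-rate
        schedule gamma_t = t**(-alpha) is the tuning parameter that OEM-style
        algorithms depend on and that IOEM aims to adapt automatically."""
        mu = np.array(mu_init, dtype=float)
        w = np.array(weights, dtype=float)
        s0 = w.copy()        # running average of the responsibilities, per component
        s1 = s0 * mu         # running average of responsibility * y, per component
        for t, y_t in enumerate(y, start=1):
            gamma = t ** (-alpha)                       # learning rate at step t
            # E-step: responsibilities of the two components for observation y_t
            log_lik = -0.5 * ((y_t - mu) / sigma) ** 2 + np.log(w)
            r = np.exp(log_lik - log_lik.max())
            r /= r.sum()
            # Stochastic-approximation update of the running sufficient statistics
            s0 = (1.0 - gamma) * s0 + gamma * r
            s1 = (1.0 - gamma) * s1 + gamma * r * y_t
            # M-step: re-estimate the means (skipped during a short burn-in, as is
            # common practice, so the first few observations do not dominate)
            if t > burn_in:
                mu = s1 / s0
        return mu

    # Toy usage: data simulated from a mixture with true means -2 and 2
    rng = np.random.default_rng(0)
    z = rng.integers(0, 2, size=20000)
    y = rng.normal(np.array([-2.0, 2.0])[z], 1.0)
    print(online_em_gmm_means(y, alpha=0.6))   # estimates should approach [-2., 2.]

Rerunning this sketch with a different exponent alpha changes how quickly the estimates settle and how noisy they remain, which is the sensitivity that motivates an adaptive learning rate.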

Highlights

  • Expectation Maximization (EM) is a general and widely used technique for estimating maximum likelihood parameters of latent variable models (Dempster et al 1977)

  • We propose an extension to the online EM (OEM) algorithm, termed Introspective Online Expectation Maximization (IOEM), which removes the need to specify a learning rate by adapting it to trends in the parameter updates

  • We show that our algorithm matches the efficiency of the optimal Batch EM (BEM) and OEM algorithms in multiple models, and that the efficiency of IOEM can exceed that of BEM/OEM methods with optimal learning rates when the model has many parameters



Introduction

Expectation Maximization (EM) is a general and widely used technique for estimating maximum likelihood parameters of latent variable models (Dempster et al 1977). When exact E-steps are intractable, stochastic approximation variants such as Batch EM (BEM) and online EM (OEM) are commonly used instead. Le Corff and Fort (2013) introduced a “block online” EM algorithm for hidden Markov models that combines online and batch ideas, controlling convergence through a block size sequence τk. All these algorithms require choosing tuning parameters in the form of a batch size, a block sequence, a learning rate, or a learning schedule. In the context of (stochastic) gradient descent optimization (Bottou 2012), several influential adaptive algorithms have recently been proposed (Zeiler 2012; Kingma and Ba 2015; Mandt et al 2016; Reddi et al 2018) that have few or no tuning parameters. In principle, these methods can be used to find maximum likelihood parameters, but unless the data are processed in batches, applying them to state-space models with a sequential structure is not straightforward.
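As a counterpart to the online update sketched above, the following minimal sketch processes the same kind of toy mixture data in consecutive fixed-size batches, running one full E-step and M-step per batch; here the batch size plays the role of the tuning parameter. This is only an illustration of the general batch-EM idea under the same toy-model assumptions (function name, batch size, and data are all assumptions), not a reproduction of the specific BEM variant compared against in this paper.

    import numpy as np

    def batch_em_gmm_means(y, batch_size=500, sigma=1.0, weights=(0.5, 0.5),
                           mu_init=(-0.5, 0.5)):
        """One pass of batch-style EM over the data: consecutive fixed-size
        batches, each receiving a full E-step and M-step for the means of a
        two-component Gaussian mixture with known weights and variance.
        The batch size is the user-chosen tuning parameter."""
        mu = np.array(mu_init, dtype=float)
        w = np.array(weights, dtype=float)
        for start in range(0, len(y), batch_size):
            y_b = y[start:start + batch_size]
            # E-step: responsibilities of every observation in the batch
            log_lik = -0.5 * ((y_b[:, None] - mu) / sigma) ** 2 + np.log(w)
            r = np.exp(log_lik - log_lik.max(axis=1, keepdims=True))
            r /= r.sum(axis=1, keepdims=True)
            # M-step: means that maximise the expected complete-data log
            # likelihood of the current batch
            mu = (r * y_b[:, None]).sum(axis=0) / r.sum(axis=0)
        return mu

    # Toy usage on the same kind of data as above (true means -2 and 2)
    rng = np.random.default_rng(1)
    z = rng.integers(0, 2, size=20000)
    y = rng.normal(np.array([-2.0, 2.0])[z], 1.0)
    print(batch_em_gmm_means(y, batch_size=500))   # estimates should approach [-2., 2.]

Choosing the batch size here trades bias against variance in much the same way as choosing the learning-rate exponent in the online sketch, which is why both are treated as tuning parameters in the text above.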

EM algorithms for a simplified autoregressive model
Batch expectation maximization
Online expectation maximization
Introspective online expectation maximization
The IOEM algorithm for the full autoregressive model
Simulations
Inference with the simplified IOEM algorithm
Inference with the complete IOEM algorithm
Inference of multiple parameters
Inference of parameters of a stochastic volatility model
Application to financial time series
Conclusion
Fixed-lag technique
Weighted regression
Pseudo-independent parameter updates
Associated methods