Speeding Up HMM Decoding and Training by Exploiting Sequence Repetitions

Shay Mozes, Oren Weimann, Michal Ziv-Ukelson

doi:10.1007/978-3-540-73437-6_4

Abstract

We present a method to speed up the dynamic programming algorithms used for solving the HMM decoding and training problems for discrete time-independent HMMs. We discuss the application of our method to Viterbi’s decoding and training algorithms [21], as well as to the forward-backward and Baum-Welch [4] algorithms. Our approach is based on identifying repeated substrings in the observed input sequence. We describe three algorithms based, respectively, on byte pair encoding (BPE) [19], run length encoding (RLE) and Lempel-Ziv (LZ78) parsing [12]. Compared to Viterbi’s algorithm, we achieve a speedup of \(\Omega(r)\) using BPE, a speedup of \(\Omega(\frac{r}{\log r})\) using RLE, and a speedup of \(\Omega(\frac{\log n}{k})\) using LZ78, where k is the number of hidden states, n is the length of the observed sequence and r is its compression ratio (under each compression scheme). Our experimental results demonstrate that our new algorithms are indeed faster in practice. Furthermore, unlike Viterbi’s algorithm, our algorithms are highly parallelizable.
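The idea of reusing work across repeated substrings can be illustrated with a minimal Python sketch for the RLE case. It is not the authors' implementation and rests on several assumptions: scores are kept in log space, one Viterbi step is written as a (max, +) matrix-vector product, and a run of identical symbols is handled by raising that step matrix to the run length with repeated squaring. All function names, the parameter layout (`log_trans[j][i]` as the score of moving from state j to state i, `log_emit[i]` as a per-state dictionary) and the toy numbers are hypothetical. The sketch returns only the score of the best path, not the path itself, and it recomputes the power for every run instead of caching it, so it shows the algebra and not the claimed speedup.

```python
from itertools import groupby


def maxplus_matmul(A, B):
    """(max, +) matrix product: C[i][j] = max over m of A[i][m] + B[m][j]."""
    k = len(A)
    return [[max(A[i][m] + B[m][j] for m in range(k)) for j in range(k)]
            for i in range(k)]


def maxplus_matvec(A, v):
    """(max, +) matrix-vector product: out[i] = max over j of A[i][j] + v[j]."""
    return [max(A[i][j] + v[j] for j in range(len(v))) for i in range(len(A))]


def maxplus_power(M, p):
    """M raised to the power p >= 1 in the (max, +) semiring, by repeated squaring."""
    result = None
    base = M
    while p > 0:
        if p & 1:
            result = base if result is None else maxplus_matmul(result, base)
        p >>= 1
        if p:
            base = maxplus_matmul(base, base)
    return result


def step_matrix(log_trans, log_emit, symbol):
    """One Viterbi step for `symbol`: M[i][j] = log_trans[j][i] + log_emit[i][symbol]."""
    k = len(log_trans)
    return [[log_trans[j][i] + log_emit[i][symbol] for j in range(k)]
            for i in range(k)]


def viterbi_plain(log_init, log_trans, log_emit, obs):
    """Best-path log score, one matrix-vector product per observed symbol."""
    v = list(log_init)
    for x in obs:
        v = maxplus_matvec(step_matrix(log_trans, log_emit, x), v)
    return max(v)


def viterbi_rle(log_init, log_trans, log_emit, obs):
    """Same score, but each run of equal symbols is applied as one matrix power."""
    v = list(log_init)
    for symbol, run in groupby(obs):
        length = sum(1 for _ in run)
        M = step_matrix(log_trans, log_emit, symbol)
        v = maxplus_matvec(maxplus_power(M, length), v)
    return max(v)


if __name__ == "__main__":
    # Toy two-state model with arbitrary (hypothetical) log scores.
    log_init = [-0.5, -0.9]
    log_trans = [[-0.4, -1.2], [-0.9, -0.5]]
    log_emit = [{"a": -0.1, "b": -2.3}, {"a": -1.6, "b": -0.2}]
    obs = "aaaaaaabbbbaaaaaaaaab"
    print(viterbi_plain(log_init, log_trans, log_emit, obs))
    print(viterbi_rle(log_init, log_trans, log_emit, obs))
```

Both functions should agree up to floating-point rounding, because the (max, +) product is associative. A run of length l costs about log l matrix-matrix products here, so any gain depends on long runs and a small number of states; recovering the actual state path would need additional bookkeeping that this sketch omits.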
