To overcome the computational complexity of the asynchronous hidden Markov model (AHMM), we present a novel multidimensional dynamic time warping (DTW) algorithm for hybrid fusion of asynchronous data. We show that our newly introduced multidimensional DTW concept requires significantly less decoding time while providing the same data fusion flexibility as the AHMM. Thus, it can be applied in a wide range of real-time multimodal classification tasks. Optimally exploiting mutual information during decoding even if the input streams are not synchronous, our algorithm outperforms late and early fusion techniques in a challenging bimodal speech and gesture fusion experiment.
Read full abstract