Statistical multimodal integration for intelligent HCI

Lizhong Wu, S. L. Oviatt, P. R. Cohen

doi:10.1109/nnsp.1999.788168

Abstract

This paper presents a statistical approach to developing multimodal recognition systems and, in particular, to integrating the posterior probabilities of the parallel input signals involved in a multimodal system. We first derive the performance bounds of multimodal recognition probabilities and identify the primary factors that influence multimodal recognition performance. We then develop a members-teams-committee (MTC) recognition technique designed to optimize recognition accuracy during the multimodal integration process. We evaluate these methods using QuickSet, a speech/gesture multimodal system, and report results based on an empirical corpus collected with QuickSet. From an architectural perspective, the integration technique presented offers enhanced robustness. It is also premised on more realistic assumptions than previous multimodal systems using semantic fusion. From a methodological standpoint, the evaluation techniques that we describe provide a valuable tool for evaluating multimodal systems.
