Abstract

This paper investigates the efficiency and usage patterns of input modes in multimodal dialogue systems for desktop and personal digital assistant (PDA) working environments. For this purpose, a form-filling travel reservation system is designed and implemented that efficiently combines the speech and visual modalities. Three multimodal modes of interaction are implemented: click-to-talk, open-mike, and modality-selection. The three multimodal systems are evaluated and compared with the GUI-only and speech-only unimodal systems. The user interface evaluation includes both objective and subjective metrics and shows that all three multimodal systems outperform the unimodal systems in the PDA environment. In the desktop environment, the multimodal systems score better than the speech-only system but worse than the GUI-only system. In all evaluation experiments, the synergy between the visual and speech modalities was significant: the multimodal interface was better than the sum of its (unimodal) parts. Results also show that users tend to use the most efficient input mode.

