Designing the User Interface for Multimodal Speech and Pen-Based Gesture Applications: State-of-the-Art Systems and Future Research Directions

Sharon Oviatt, Phil Cohen, Lizhong Wu, Lisbeth Duncan, Bernhard Suhm, Josh Bers, Thomas Holzman, Terry Winograd, James Landay, Jim Larson, David Ferro

doi:10.1207/s15327051hci1504_1

Abstract

The growing interest in multimodal interface design is inspired in large part by the goals of supporting more transparent, flexible, efficient, and powerfully expressive means of human-computer interaction than in the past. Multimodal interfaces are expected to support a wider range of diverse applications, be usable by a broader spectrum of the average population, and function more reliably under realistic and challenging usage conditions. In this article, we summarize the emerging architectural approaches for interpreting speech and pen-based gestural input in a robust manner, including early and late fusion approaches and the new hybrid symbolic-statistical approach. We also describe a diverse collection of state-of-the-art multimodal systems that process users' spoken and gestural input. These applications range from map-based and virtual reality systems for engaging in simulations and training, to field medic systems for mobile use in noisy environments, to web-based transactions and standard text-editing applications that will reshape daily computing and have a significant commercial impact. To realize successful multimodal systems of the future, many key research challenges remain to be addressed. Among these challenges are the development of cognitive theories to guide multimodal system design, and the development of effective natural language processing, dialogue processing, and error-handling techniques. In addition, new multimodal systems will be needed that can function more robustly and adaptively, and with support for collaborative multiperson use. Before this new class of systems can proliferate, toolkits also will be needed to promote software development for both simulated and functioning systems.

Journal: Human–Computer Interaction
Publication Date: Dec 1, 2000
Citations: 393

Similar Papers

Using cognitive models to understand multimodal processes: the case for speech and gesture production
Stefan Kopp ... Kirsten Bergmann
24 Apr 2017

Chapter 12 - Multimodal Input
Natalie Ruiz ... Sharon Oviatt
Multimodal Signal Processing
01 Jan 2009

Multi-modal fusion for associated news story retrieval
Ehsan Younessian ... Deepu Rajan
Multimedia Tools and Applications | VOL. 74
08 Mar 2013

Building a Practical Multimodal System with a Multimodal Fusion Module
Yong Sun ... Vera Chung
01 Jan 2009
