Abstract

We introduce a new framework for building human-computer interfaces that perform online, automatic audio-gestural command recognition. The overall system supports the construction of a multimodal interface that recognizes user input expressed naturally as audio commands and manual gestures, captured by sensors such as Kinect. It includes a component for acquiring multimodal user data, which serves as input to a module responsible for training audio-gestural models. These models are then employed by the automatic recognition component, which supports online recognition of the audio-visual modalities. The framework is exemplified by a working system use case, demonstrating the potential of the software platform for building new human-computer interaction systems. Moreover, users may populate libraries of models and data that can be shared over a network, allowing them to reuse or extend existing systems.
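The abstract's three components (data acquisition, model training, online recognition) map naturally onto a modular software pipeline. The sketch below is illustrative only: the names (MultimodalSample, train_models, recognize) are assumptions, not the authors' API, and a toy nearest-centroid classifier stands in for the trained audio-gestural models, which in practice would be sequence models (e.g., HMMs or neural networks) over synchronized audio and Kinect skeleton streams.

```python
"""Minimal sketch (assumed names, not the paper's API) of the pipeline
the abstract describes: acquisition -> training -> online recognition."""

from dataclasses import dataclass

import numpy as np


@dataclass
class MultimodalSample:
    """One recorded command: fused audio + gesture features and its label."""
    features: np.ndarray  # e.g., concatenated audio MFCCs and joint angles
    label: str


def train_models(samples: list[MultimodalSample]) -> dict[str, np.ndarray]:
    """Training module stand-in: one feature centroid per command label."""
    by_label: dict[str, list[np.ndarray]] = {}
    for s in samples:
        by_label.setdefault(s.label, []).append(s.features)
    return {label: np.mean(feats, axis=0) for label, feats in by_label.items()}


def recognize(models: dict[str, np.ndarray], features: np.ndarray) -> str:
    """Online recognition stand-in: the nearest trained model wins."""
    return min(models, key=lambda label: np.linalg.norm(models[label] - features))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Acquisition-component stand-in: synthetic samples for two commands.
    samples = [MultimodalSample(rng.normal(0, 1, 8), "open") for _ in range(5)]
    samples += [MultimodalSample(rng.normal(3, 1, 8), "close") for _ in range(5)]
    models = train_models(samples)
    print(recognize(models, rng.normal(3, 1, 8)))  # expected: "close"
```

Because the trained models are a plain dictionary here, this structure also illustrates the abstract's sharing scenario: serialized model libraries could be exchanged over a network and loaded into another user's recognizer to reuse or extend an existing system.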
