Abstract
Current virtual environment (VE) input techniques often overlook speech as a useful control modality. Speech could improve interaction in multimodal VEs by enabling users to address objects, locations, and agents, yet research on how to design effective speech input for VEs is limited. Our paper investigates the effect of agent feedback on speech-based VE experiences. In a lab study, users commanded agents to navigate a VE, receiving auditory, visual, or behavioural feedback. Based on post-interaction semi-structured interviews, we find that the type of feedback given by agents is critical to user experience. Specifically, auditory mechanisms are preferred, allowing users to engage with other modalities seamlessly during interaction. Although command-like utterances were frequently used, they were perceived as contextually appropriate, ensuring users felt understood. Many participants also found it difficult to discover speech-based functionality. Drawing on these findings, we discuss key challenges for designing speech input for VEs.
Highlights
Speech is often overlooked as an input technique for virtual environments (VEs)
We found that auditory feedback was preferred over visual and agent behavioural feedback; it freed participants to engage with other input modalities, scan their environment, and move on to other tasks more effectively
Speech is growing in popularity as an input modality, yet is not heavily used in VE interactions
Summary
Speech is often overlooked as an input technique for virtual environments (VEs). Current research and commercial products have typically focused on gesture, gaze, and locomotion, but input techniques that require physical action have limitations that make them unsatisfying in otherwise high-fidelity VE interactions [11]. The mapping of continuous human motion to discrete controls presents serious challenges, and physical inputs have limitations including low information capacity [45], fatigue [46], and encumbrance [26]. Incorporating speech as a modality for VE interaction could overcome these challenges by adding a familiar and information-rich input technique to existing physical inputs. Speech as part of a multimodal experience was demonstrated in the foundational “Put that there” [3] system, but exploring how speech input works in a modern multimodal VE presents a new series of challenges and opportunities. By incorporating speech alongside traditional input techniques in VEs, we have an opportunity to develop experiences offering deeply instinctive control for a variety of situations and tasks.