Abstract

Human-machine addressee detection (H-M AD) is a modern paralinguistics and dialogue challenge that arises in multiparty conversations between several people and a spoken dialogue system (SDS), since the users may also talk to each other, and even to themselves, while interacting with the system. The SDS is supposed to determine whether or not it is being addressed. All existing studies on acoustic H-M AD were conducted on corpora designed in such a way that a human addressee and a machine played different dialogue roles. This peculiarity influences speakers’ behaviour and increases vocal differences between human- and machine-directed utterances. In the present study, we consider the Restaurant Booking Corpus (RBC), which consists of complexity-identical human- and machine-directed phone calls and allows us to eliminate most of the factors that implicitly influence speakers’ behaviour. The only remaining factor is the speakers’ explicit awareness of their interlocutor (a technical system or a human being). Although complexity-identical H-M AD is essentially more challenging than the classical task, data augmentation allowed us to achieve significant improvements (unweighted average recall (UAR) = 0.628) over both native listeners (UAR = 0.596) and the baseline classifier presented by the RBC developers (UAR = 0.539).
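The UAR values quoted above are macro-averaged recalls: each class (human- vs. machine-directed) contributes equally regardless of how many utterances it contains. A minimal sketch of the computation, using made-up labels rather than RBC data:

```python
# Unweighted average recall (UAR): mean of the per-class recalls,
# so a majority class cannot dominate the score.
def uar(y_true, y_pred):
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        true_c = [i for i, y in enumerate(y_true) if y == c]
        hits = sum(1 for i in true_c if y_pred[i] == c)
        recalls.append(hits / len(true_c))
    return sum(recalls) / len(recalls)

# Illustrative example: 4 human-directed (H) and 2 machine-directed (M)
# utterances; recall(H) = 3/4, recall(M) = 1/2, so UAR = 0.625.
y_true = ["H", "H", "H", "H", "M", "M"]
y_pred = ["H", "H", "H", "M", "M", "H"]
print(uar(y_true, y_pred))  # 0.625
```

This is equivalent to scikit-learn's `recall_score(..., average="macro")`; the plain-Python version is shown only to make the averaging explicit.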

Highlights

  • Spoken dialogue systems (SDSs) appeared a couple of decades ago and have already become part of our everyday life

  • Complexity-identical human-machine addressee detection (H-M AD) is essentially more challenging than the classical task; nevertheless, data augmentation yields significant improvements (unweighted average recall (UAR) = 0.628) over both native listeners (UAR = 0.596) and the baseline classifier presented by the Restaurant Booking Corpus (RBC) developers (UAR = 0.539)

  • Virtual assistants, e.g., Siri, Cortana, Alexa, and Alisa, are typical examples of modern spoken dialogue systems (SDSs); such systems face the problem of human-machine addressee detection (H-M AD), which arises in multiparty spoken conversations between several people and an SDS


Introduction

Spoken dialogue systems (SDSs) appeared a couple of decades ago and have already become part of our everyday life. Speech is the most natural way of communication between people, and they usually prefer speech-based user interfaces over textual and graphical input alone when it comes to natural interaction with technical systems [1]. Considerable progress has been made towards adaptive SDSs [5] and understanding multiparty conversations [6,7,8]. Virtual assistants, e.g., Siri, Cortana, Alexa, and Alisa, are typical examples of modern SDSs. Such systems face the problem of human-machine addressee detection (H-M AD), which arises in multiparty spoken conversations between several people and an SDS.

Sensors 2020, 20, 2740; doi:10.3390/s20092740; www.mdpi.com/journal/sensors
