Abstract

Current referring expression generation (REG) systems mostly deliver their output as one-shot, written expressions. We present ongoing work on incremental generation of spoken expressions referring to objects in real-world images. This approach extends previous work using the words-as-classifiers model for generation. We implement this generator in an incremental dialogue processing framework so that we can exploit an existing interface to incremental text-to-speech synthesis. Our system generates and synthesizes referring expressions while continuously observing non-verbal user reactions.

Highlights

  • We present Refer-iTTS, a system that is meant to support research on real-time spoken REG and builds upon recent approaches to REG from real-world images (Kazemzadeh et al., 2014; Zarrieß and Schlangen, 2016).

  • While generating and synthesizing the referring expression (RE), the system continuously observes the non-verbal reactions of the user and incrementally adapts the generated utterances to these reactions.

  • The system tries to be as cooperative as possible: if the user shows no reaction for a certain amount of time, it expands the previous expression. That is, the system splits its referring expression over several utterances, a strategy usually known as “reference in installments” (cf. Zarrieß and Schlangen, 2016).
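
The installment behaviour described in the last highlight can be illustrated with a minimal sketch. This is not the Refer-iTTS implementation: the function name `refer_in_installments`, the `user_reacted` callback, and the 2-second default timeout are all hypothetical, and the list of installments is assumed to be given rather than produced by the words-as-classifiers generator. The sketch only shows the control flow "speak one installment, wait for a reaction, expand if none arrives"; adapting to the content of a reaction is not modelled.

```python
from typing import Callable, List


def refer_in_installments(
    installments: List[str],
    user_reacted: Callable[[float], bool],
    timeout: float = 2.0,  # assumed value; the source only says "a certain amount of time"
) -> List[str]:
    """Deliver a referring expression piece by piece until the user reacts.

    `user_reacted(timeout)` is a hypothetical callback that waits for at most
    `timeout` seconds and returns True if a non-verbal reaction was observed.
    Returns the installments that were actually spoken.
    """
    spoken: List[str] = []
    for chunk in installments:
        # Stand-in for handing the chunk to an incremental speech synthesizer.
        spoken.append(chunk)
        if user_reacted(timeout):
            # The user reacted, so no further expansion is needed.
            break
        # No reaction within the timeout: continue with the next installment.
    return spoken


# Simulated user who reacts only after the third installment.
reactions = iter([False, False, True])
spoken = refer_in_installments(
    ["the man", "on the left", "in the red shirt", "next to the car"],
    user_reacted=lambda timeout: next(reactions),
)
print(spoken)
```

In this simulation the fourth installment is never spoken, because the simulated reaction arrives after the third. A real system would replace the callback with a component that monitors the user's actual behaviour.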

Summary

Introduction

We present Refer-iTTS, a system that is meant to support research on real-time spoken REG and builds upon recent approaches to REG from real-world images (Kazemzadeh et al., 2014; Zarrieß and Schlangen, 2016). We use the recently proposed words-as-classifiers (WAC) model for generation from low-level visual inputs and integrate it with InproTk (Baumann and Schlangen, 2012b), an open-source framework for incremental dialogue processing.

