Abstract

Almost all animals exploit vocal signals for a range of ecologically motivated purposes: detecting predators/prey and marking territory, expressing emotions, establishing social relations and sharing information. Whether it is a bird raising an alarm, a whale calling to potential partners, a dog responding to human commands, a parent reading a story with a child, or a businessperson accessing stock prices using \emph{Siri}, vocalisation provides a valuable communication channel through which behaviour may be coordinated and controlled, and information may be distributed and acquired. Indeed, the ubiquity of vocal interaction has led to research across an extremely diverse array of fields, from assessing animal welfare, to understanding the precursors of human language, to developing voice-based human-machine interaction. Opportunities for cross-fertilisation between these fields abound; for example, using artificial cognitive agents to investigate contemporary theories of language grounding, using machine learning to analyse different habitats, or adding vocal expressivity to the next generation of language-enabled autonomous social agents. However, much of the research is conducted within well-defined disciplinary boundaries, and many fundamental issues remain. This paper attempts to redress the balance by presenting a comparative review of vocal interaction within-and-between humans, animals and artificial agents (such as robots), and it identifies a rich set of open research questions that may benefit from an inter-disciplinary analysis.

Highlights

  • Almost all living organisms make sounds – even plants (Appel and Cocroft, 2014) – and many animals have specialized biological apparatus that is adapted to the perception and production of sound (Hopp and Evans, 1998).

  • What are the common features of vocal learning that these species share, and why is it restricted to only a few species? How does a young animal solve the correspondence problem between the vocalizations that they hear and the sounds that they can produce? Who should adapt to whom in order to establish an effective channel [see, for example, Bohannon and Marquis (1977), for a study showing that adults adapt their vocal interactivity with children based on comprehension feedback by the children]? How are vocal referents acquired? What, precisely, are the mechanisms underlying vocal learning?

  • What are the limitations of vocal interaction between non-conspecifics? What can be learned from attempts to teach animals human language? How do conspecifics accommodate mismatches in temporal histories or cultural experience? How can insights from such questions inform the design of vocally interactive artificial agents beyond Siri? Is it possible to detect differences in how different agents ground concepts from their language use, and can artificial agents use such information in vocal interactivity with humans [as suggested by Thill et al. (2014)]?

Summary

INTRODUCTION

Almost all living organisms make (and make use of) sounds – even plants (Appel and Cocroft, 2014) – and many animals have specialized biological apparatus that is adapted to the perception and production of sound (Hopp and Evans, 1998). Systems have been created to analyze and play back animal calls, to investigate how vocal signaling might evolve in communicative agents, and to interact with users of spoken language technology. The latter has witnessed huge commercial success in the past 10–20 years, since the release of Naturally Speaking (Dragon’s continuous speech dictation software for a PC) in 1997 and Siri (Apple’s voice-operated personal assistant and knowledge navigator for the iPhone) in 2011. Some of these fields, such as human spoken language or vocal interactivity between animals, have a long history of scientific research. Others, such as vocal interaction between artificial agents or between artificial agents and animals, are less well studied – mainly due to the relatively recent appearance of the relevant technology. When reviewing research on specific aspects of human and/or animal vocal interactivity, we highlight the questions that this research raises for the design of future vocally interactive technologies.

Physiology and Morphology
Properties and Function of Animal Signals
Structure
Human Language Evolution and Development
Interlocutor Abilities
Conveyance of Emotion
Comparative Analysis of Human and Animal Vocalization
Use of Vocalization
Vocal Interactivity between
Spoken Language Systems
TECHNOLOGY-BASED RESEARCH METHODS
CONCLUSION

