Abstract

Recent research on speech communication has revealed a tendency for speakers to imitate at least some of the characteristics of their interlocutor's speech sound shape. This phenomenon, referred to as phonetic convergence, entails a moment-to-moment adaptation of the speaker's speech targets to the perceived interlocutor's speech. It is thought to contribute to setting up a conversational common ground between speakers and to facilitate mutual understanding. However, it remains uncertain to what extent phonetic convergence occurs in voice fundamental frequency (F0), in spite of the major role played by pitch, F0's perceptual correlate, as a conveyor of both linguistic information and communicative cues associated with the speaker's social/individual identity and emotional state. In the present work, we investigated to what extent two speakers converge towards each other with respect to variations in F0 in a scripted dialogue. Pairs of speakers jointly performed a speech production task, in which they were asked to alternately read aloud a written story divided into a sequence of short reading turns. We devised an experimental set-up that allowed us to manipulate the speakers' F0 in real time across turns. We found that speakers tended to imitate each other's changes in F0 across turns that were both limited in amplitude and spread over large temporal intervals. This shows that, at the perceptual level, speakers monitor slow-varying movements in their partner's F0 with high accuracy and, at the production level, that speakers exert a very fine-tuned control on their laryngeal vibrator in order to imitate these F0 variations. Remarkably, F0 convergence across turns was found to occur in spite of the large melodic variations typically associated with reading turns. Our study sheds new light on speakers' perceptual tracking of F0 in speech processing, and the impact of this perceptual tracking on speech production.

Highlights

  • In spoken-language interactions, recent work has revealed that speakers tend to imitate their interlocutor’s own way of speaking

  • In our experimental set-up, the shift we introduced in each speaker’s F0 between two of their consecutive reading turns was always smaller than one-sixth of a tone

  • We found that the average F0 range of a turn, measured as the mean difference between the turn maximum and minimum values, was 10.94 semitones (SD = 3.00) across participants, close to one octave, and was much larger than the one-sixth of a tone shift between reading turns in each speaker
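To put the two magnitudes in the highlights on a common scale, the sketch below uses the standard equal-tempered conversion between a frequency ratio and semitones, with one tone taken as two semitones. The function names and the 120 Hz starting value are illustrative assumptions, not taken from the study; only the one-sixth-of-a-tone bound and the 10.94-semitone mean range come from the text.

```python
import math


def semitones_between(f_low_hz: float, f_high_hz: float) -> float:
    """Interval from f_low_hz to f_high_hz in semitones (12 per octave)."""
    if f_low_hz <= 0 or f_high_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f_high_hz / f_low_hz)


def shift_frequency(f0_hz: float, semitones: float) -> float:
    """Frequency obtained by shifting f0_hz by the given number of semitones."""
    return f0_hz * 2.0 ** (semitones / 12.0)


# One tone = 2 semitones, so one-sixth of a tone = 1/3 semitone.
MAX_SHIFT_ST = 2.0 / 6.0

# Mean within-turn F0 range reported in the highlights (in semitones).
MEAN_TURN_RANGE_ST = 10.94

# Hypothetical example: a 120 Hz F0 shifted up by the largest allowed step.
example_f0_hz = 120.0  # assumed value, for illustration only
shifted_f0_hz = shift_frequency(example_f0_hz, MAX_SHIFT_ST)

# How many times larger the within-turn range is than the between-turn shift.
range_to_shift_ratio = MEAN_TURN_RANGE_ST / MAX_SHIFT_ST

if __name__ == "__main__":
    print(f"shifted F0: {shifted_f0_hz:.2f} Hz")
    print(f"range / shift ratio: {range_to_shift_ratio:.1f}")
```

Under these assumptions, the largest between-turn shift moves a 120 Hz voice by only a couple of hertz, while the within-turn range is more than thirty times larger on the semitone scale.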


Summary

Introduction

In spoken-language interactions, recent work has revealed that speakers tend to imitate their interlocutor’s own way of speaking (see [1] for a recent review).

Between-speaker convergence in F0 in a joint speech production task.

ANR-08-BLAN-0276-01, ANR-16-CONV-0002 (ILCB) and ANR-11-LABX0036 (BLRI)), and of the European Research Council (erc.europa.eu) under the European Community’s Seventh Framework Program. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

