Abstract

When producing intimidating aggressive vocalizations, humans and other animals often extend their vocal tracts to lower their voice resonance frequencies (formants) and thus sound big. Is acoustic size exaggeration more effective when the vocal tract is extended before, or during, the vocalization, and how do listeners interpret within-call changes in apparent vocal tract length? We compared perceptual effects of static and dynamic formant scaling in aggressive human speech and nonverbal vocalizations. Acoustic manipulations corresponded to elongating or shortening the vocal tract either around (Experiment 1) or from (Experiment 2) its resting position. Gradual formant scaling that preserved average frequencies conveyed the impression of smaller size and greater aggression, regardless of the direction of change. Vocal tract shortening from the original length conveyed smaller size and less aggression, whereas vocal tract elongation conveyed larger size and more aggression, and these effects were stronger for static than for dynamic scaling. Listeners familiarized with the speaker's natural voice were less often ‘fooled’ by formant manipulations when judging speaker size, but paid more attention to formants when judging aggressive intent. Thus, within-call vocal tract scaling conveys emotion, but a better way to sound large and intimidating is to keep the vocal tract consistently extended.

Highlights

  • Looking and sounding impressively large is often advantageous for group-living animals, particularly during dominance displays in males, leading to anatomical and behavioural adaptations for advertising, and often exaggerating, body size [1,2]

  • Smiling speech can be considered a type of dynamic vocal tract length (VTL) control because spreading the lips shortens the vocal tract and temporarily raises formant frequencies [22], and the resulting ‘auditory smile’ is perceived by listeners as an expression of happiness [23]

  • This design enabled us to distinguish between the average apparent VTL over time and its direction of change. To complement this controlled manipulation with a more ecologically valid scenario, in Experiment 2, we presented a different sample of listeners with angry utterances in which formant frequencies were experimentally scaled statically or dynamically from the same neutral value


Summary

Introduction

Looking and sounding impressively large is often advantageous for group-living animals, particularly during dominance displays in males, leading to anatomical and behavioural adaptations for advertising, and often exaggerating, body size [1,2]. A noticeable increase in average vocal tract length (VTL) was reported for angry and sad compared to happy speech in a recent imaging study [12]. This suggests that vocalizers capable of changing their VTL may do so not merely to sound large, but to express a range of emotions and intentions. Smiling speech can be considered a type of dynamic VTL control because spreading the lips shortens the vocal tract and temporarily raises formant frequencies [22], and the resulting ‘auditory smile’ is perceived by listeners as an expression of happiness [23]. The authors speculated that changes in VTL might communicate an effort to exaggerate or extenuate apparent size, which listeners in turn interpret as an expression of intent or emotion. This pioneering study employed short isolated vowels, which have limited ecological validity, and the results depended on the tested vowel. Because the average formant frequencies, and the average apparent VTL, were not affected by dynamic manipulations, this design can be seen as a model of VTL changing around its neutral value, which enabled us to distinguish between the average apparent VTL over time and its direction of change.
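The link between formant scaling and apparent VTL can be illustrated with a minimal sketch. It assumes the textbook approximation of the vocal tract as a uniform tube closed at one end, which is not necessarily the model used in the study. The function names, the 17.5 cm resting length, the 10% scaling depth, and the speed of sound value are all hypothetical choices made for illustration only.

```python
SPEED_OF_SOUND_CM_S = 35000.0  # assumed round value for warm, humid air in the vocal tract


def tube_formants(vtl_cm, n_formants=3):
    """Resonances of a uniform tube closed at one end: F_n = (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * vtl_cm)
            for n in range(1, n_formants + 1)]


def apparent_vtl(formants_hz):
    """Invert the tube formula for each formant and average the length estimates."""
    estimates = [(2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * f)
                 for n, f in enumerate(formants_hz, start=1)]
    return sum(estimates) / len(estimates)


def scale_ramp(start, end, n_frames):
    """Linear ramp of multiplicative formant scale factors across n_frames frames."""
    if n_frames == 1:
        return [start]
    step = (end - start) / (n_frames - 1)
    return [start + step * i for i in range(n_frames)]


def apply_scaling(formants_hz, factors):
    """Scale every formant by the per-frame factor; returns one formant list per frame."""
    return [[f * k for f in formants_hz] for k in factors]


neutral = tube_formants(17.5)             # hypothetical 17.5 cm tract -> 500, 1500, 2500 Hz
static_low = scale_ramp(0.9, 0.9, 5)      # formants lowered throughout (tract kept extended)
falling_around = scale_ramp(1.1, 0.9, 5)  # falls through the neutral value, mean factor 1.0
falling_from = scale_ramp(1.0, 0.9, 5)    # starts at the neutral value, then falls

for name, factors in [("static", static_low),
                      ("around", falling_around),
                      ("from", falling_from)]:
    frames = apply_scaling(neutral, factors)
    lengths = [round(apparent_vtl(frame), 2) for frame in frames]
    print(name, lengths)
```

Under this approximation, multiplying all formants by a factor k divides the apparent VTL by k, so falling formants correspond to an elongating tract. The ramp centred on 1.0 leaves the average formant frequencies unchanged, as in the "around the neutral value" design, whereas the ramp starting at 1.0 shifts the average, as in the "from the neutral value" design.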

[Figure: (a) formant shifts; labels: F1, F2, F3; original, high, rising, falling]
Experiment 1
Stimuli
Manipulations
Procedure
Participants
Data analysis
Results
Height
Aggression
Emotion intensity
Authenticity
Experiment 2
General discussion