Abstract

Children’s ability to distinguish speakers’ voices continues to develop throughout childhood, yet it remains unclear how children’s sensitivity to voice cues, such as differences in speakers’ gender, develops over time. This so-called voice gender is primarily characterized by speakers’ mean fundamental frequency (F0), related to glottal pulse rate, and vocal-tract length (VTL), related to speakers’ size. Here we show that children’s acquisition of adult-like performance for discrimination, a lower-order perceptual task, and categorization, a higher-order cognitive task, differs across voice gender cues. Children’s discrimination was adult-like around the age of 8 for VTL but still differed from adults at the age of 12 for F0. Children’s perceptual weight attributed to F0 for gender categorization was adult-like around the age of 6 but around the age of 10 for VTL. Children’s discrimination and weighting of F0 and VTL were only correlated for 4- to 6-year-olds. Hence, children’s development of discrimination and weighting of voice gender cues are dissociated, i.e., adult-like performance for F0 and VTL is acquired at different rates and does not seem to be closely related. The different developmental patterns for auditory discrimination and categorization highlight the complexity of the relationship between perceptual and cognitive mechanisms of voice perception.

Highlights

  • Infants are already sensitive to differences in voice cues, such as fundamental frequency (F0)[3] or voice pitch[4], timbre differences associated with vocal-tract length[5], or prosody[6]

  • The perceived gender of a voice results from a combination of many acoustic features[26,27]. It is primarily defined by the mean F0, related to glottal pulse rate, and vocal-tract length (VTL), related to the size of the speaker[28]

Summary

Introduction

Children’s ability to distinguish speakers’ voices continues to develop throughout childhood, yet it remains unclear how children’s sensitivity to voice cues, such as differences in speakers’ gender, develops over time. These higher-order cognitive identification tasks depend on long-term exposure to voice cues and stored representations of voice categories in addition to perceptual processing[19,20]. Despite these shared features, the source of the relatively slow development in sensitivity to differences in voice cues and the relation between discrimination and categorization of voice cues are not well understood. Mann et al.[7] proposed that the development in children’s ability to encode unfamiliar speakers’ voice characteristics may be caused by different processing strategies and reliance on different acoustic cues compared to adults. This hypothesis is in agreement with the ‘Developmental Weighting Shift (DWS)’ model for categorical perception of phonemes by Nittrouer and Miller[18], which states that children weigh dynamic acoustic cues of speech (such as formant transitions) more than static acoustic speech cues (such as noise segments of consonants) due to their higher perceptual salience and more informative properties. F0, with its information-bearing dynamic fluctuations, seems to parallel dynamic formant transitions in the DWS model and may be subject to similar principles.
