Little is known about the influence of vocal emotions on speech understanding. Word recognition accuracy for stimuli spoken to portray seven emotions (anger, disgust, fear, sadness, neutral, happiness, and pleasant surprise) was tested in younger and older listeners. Emotions were presented in either mixed (heterogeneous emotions mixed in a list) or blocked (homogeneous emotion blocked in a list) conditions. Three main hypotheses were tested. First, vocal emotion affects word recognition accuracy; specifically, portrayals of fear enhance word recognition accuracy because listeners orient to threatening information and/or distinctive acoustical cues such as high pitch mean and variation. Second, older listeners recognize words less accurately than younger listeners, but the effects of different emotions on intelligibility are similar across age groups. Third, blocking emotions in list results in better word recognition accuracy, especially for older listeners, and reduces the effect of emotion on intelligibility because as listeners develop expectations about vocal emotion, the allocation of processing resources can shift from emotional to lexical processing. Emotion was the within-subjects variable: all participants heard speech stimuli consisting of a carrier phrase followed by a target word spoken by either a younger or an older talker, with an equal number of stimuli portraying each of seven vocal emotions. The speech was presented in multi-talker babble at signal to noise ratios adjusted for each talker and each listener age group. Listener age (younger, older), condition (mixed, blocked), and talker (younger, older) were the main between-subjects variables. Fifty-six students (Mage= 18.3 years) were recruited from an undergraduate psychology course; 56 older adults (Mage= 72.3 years) were recruited from a volunteer pool. All participants had clinically normal pure-tone audiometric thresholds at frequencies ≤3000 Hz. There were significant main effects of emotion, listener age group, and condition on the accuracy of word recognition in noise. Stimuli spoken in a fearful voice were the most intelligible, while those spoken in a sad voice were the least intelligible. Overall, word recognition accuracy was poorer for older than younger adults, but there was no main effect of talker, and the pattern of the effects of different emotions on intelligibility did not differ significantly across age groups. Acoustical analyses helped elucidate the effect of emotion and some intertalker differences. Finally, all participants performed better when emotions were blocked. For both groups, performance improved over repeated presentations of each emotion in both blocked and mixed conditions. These results are the first to demonstrate a relationship between vocal emotion and word recognition accuracy in noise for younger and older listeners. In particular, the enhancement of intelligibility by emotion is greatest for words spoken to portray fear and presented heterogeneously with other emotions. Fear may have a specialized role in orienting attention to words heard in noise. This finding may be an auditory counterpart to the enhanced detection of threat information in visual displays. The effect of vocal emotion on word recognition accuracy is preserved in older listeners with good audiograms and both age groups benefit from blocking and the repetition of emotions.
Read full abstract