Abstract

Listeners can recognize newly learned voices from previously unheard utterances, suggesting the acquisition of high-level speech-invariant voice representations during learning. Using functional magnetic resonance imaging (fMRI), we investigated the anatomical basis underlying the acquisition of voice representations for unfamiliar speakers independent of speech, and their subsequent recognition among novel voices. Specifically, listeners studied voices of unfamiliar speakers uttering short sentences and subsequently classified studied and novel voices as “old” or “new” in a recognition test. To investigate “pure” voice learning, i.e., independent of sentence meaning, we presented German sentence stimuli to non-German-speaking listeners. To disentangle stimulus-invariant and stimulus-dependent learning, during the test phase we contrasted a “same sentence” condition, in which listeners heard speakers repeating the sentences from the preceding study phase, with a “different sentence” condition. Voice recognition performance was above chance in both conditions, although, as expected, performance was higher for same than for different sentences. During study phases, activity in the left inferior frontal gyrus (IFG) was related to subsequent voice recognition performance and to the same versus different sentence condition, suggesting an involvement of the left IFG in the interactive processing of speaker and speech information during learning. Importantly, at test, reduced activation for voices correctly classified as “old” compared to “new” emerged in a network of brain areas including temporal voice areas (TVAs) of the right posterior superior temporal gyrus (pSTG), as well as the right inferior/middle frontal gyrus (IFG/MFG), the right medial frontal gyrus, and the left caudate. This effect of voice novelty did not interact with sentence condition, suggesting a role of temporal voice-selective areas and extra-temporal areas in the explicit recognition of learned voice identity, independent of speech content.

Highlights

  • This relatively widespread network may serve several subfunctions: during voice learning, brain activity in the left inferior frontal gyrus (IFG) was related to subsequent voice recognition performance, which further interacted with speech content

  • This suggests that the left IFG mediates the interactive processing of speaker and speech information while new voice representations are being built

Introduction

In daily social interactions, we recognize familiar people from their voices across various utterances (Skuk & Schweinberger, 2013). Functional magnetic resonance imaging (fMRI) research suggests that, following low-level analysis in temporal primary auditory cortices, voices are structurally encoded and compared to long-term voice representations in bilateral temporal voice areas (TVAs), predominantly of the right STS (Belin, Zatorre, Lafaille, Ahad, & Pike, 2000; Pernet et al., 2015). This is in line with hierarchical models of voice processing (Belin, Fecteau, & Bedard, 2004; Belin et al., 2011). While previous studies have used various tasks and levels of voice familiarity to identify the neural correlates of voice identity processing, the neural mechanisms mediating the acquisition of high-level (invariant) voice representations during learning and subsequent recognition remain poorly explored.
