Abstract

Speech recognition performance is generally worse for utterances produced by a mix of several talkers than for utterances produced by a single talker. This impairment can be attributed to those aspects of talker normalization used to determine the vocal characteristics of the talker each time the talker changes. The present study investigated the size and nature of talker differences that may affect normalization. Spoken words were generated by a text-to-speech system for matched pairs of synthetically defined talkers. All but two of these pairs differed only in average fundamental frequency. Of the two remaining pairs, one differed in perceived gender while both talkers had the same average pitch; the other differed in both gender and pitch. Response times in a speeded word recognition task were compared for blocks of stimuli produced by a single talker and blocks of stimuli produced by a mix of the two talkers in one of the pairs. The results are important for understanding how listeners use pitch differences between talkers during talker normalization.
