In cocktail-party listening environments, how the different levels of the linguistic hierarchy in unattended speech are processed, and how they influence target speech recognition, remains controversial. This study investigates how different levels of linguistic structure (syllable, word, and sentence) in competing speech influence the recognition of target speech in a speech-on-speech masking situation. Thirty-six participants were instructed to recognize target speech masked by competing speech whose masker type varied across syllables, words, and sentences. The perceived spatial location was altered to examine the interaction between linguistic unmasking and spatial unmasking effects. Recognition performance (i.e., the intelligibility threshold) was determined by fitting psychometric functions to the recognition accuracies at four signal-to-noise ratios (-14, -10, -6, and -2 dB) to evaluate each subject's ability to cope with challenging listening conditions. We found a significant decline in target speech recognition when the masking speech was linguistically structured and intelligible. Specifically, masking speech with higher linguistic complexity, such as coherent sentences, produced stronger interference than masking speech with lower complexity, such as sequences of syllables. The linguistic release from masking, resulting from a decrease in the linguistic complexity of the masker from sentences to syllables, was correlated with, and linearly additive to, the spatial release from masking produced by spatially separating the masker and the target. These findings illustrate the influence of the linguistic complexity of masking speech on the recognition of target speech, suggesting that irrelevant speech undergoes higher-level linguistic processing in noisy environments.
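The threshold estimation described above can be sketched as follows. This is a minimal illustration only: the abstract does not state the form of the psychometric function or the fitting procedure, so the two-parameter logistic, the 50%-correct threshold definition, and the least-squares grid search are all assumptions. The accuracies are synthetic values generated from a known curve, not data from the study, and the function names are hypothetical.

```python
import math

def logistic(snr_db, threshold_db, slope):
    """Predicted proportion correct at a given SNR (assumed logistic form)."""
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - threshold_db)))

def fit_threshold(snrs_db, accuracies):
    """Least-squares grid search for the 50%-correct threshold and the slope."""
    best = None
    for t_step in range(-200, 1):        # thresholds from -20.0 to 0.0 dB, 0.1 dB steps
        threshold = t_step / 10.0
        for s_step in range(1, 201):     # slopes from 0.01 to 2.00 per dB
            slope = s_step / 100.0
            err = sum((logistic(x, threshold, slope) - y) ** 2
                      for x, y in zip(snrs_db, accuracies))
            if best is None or err < best[0]:
                best = (err, threshold, slope)
    return best[1], best[2]

SNRS_DB = [-14, -10, -6, -2]             # the four SNRs named in the abstract

# Synthetic accuracies from a known curve (illustrative, NOT study data).
TRUE_THRESHOLD_DB, TRUE_SLOPE = -8.0, 0.5
synthetic_acc = [logistic(x, TRUE_THRESHOLD_DB, TRUE_SLOPE) for x in SNRS_DB]

threshold_db, slope = fit_threshold(SNRS_DB, synthetic_acc)
print(f"estimated threshold: {threshold_db:.1f} dB SNR, slope: {slope:.2f} per dB")
```

Under these assumptions, a release from masking between two conditions would be the difference between their fitted thresholds, for example the sentence-masker threshold minus the syllable-masker threshold for the linguistic release.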