Can dual-carrier processing restore masking release in vocoded speech?

Brittney Carter, Eric Healy, Frederic Apoux

doi:10.1121/1.4988439

Abstract

Speech processed to replace the original temporal fine structure (TFS) with tone or noise carriers (vocoder processing) is generally less intelligible than natural, unprocessed speech, especially when background noise is present. Moreover, the intelligibility deficit associated with vocoder processing is typically larger when the background fluctuates over time. This deleterious effect of vocoder processing has led to the postulate that TFS cues play a critical role when listening in the dips of the background. Recently, we proposed a technique to reintroduce synthetic TFS cues in vocoded speech using one carrier for the target and one carrier for the background. This “dual-carrier” approach allows sentence intelligibility with a speech masker to reach a level almost comparable to that of natural speech. The goal of the present study was to determine whether dual-carrier processing simply improves speech recognition across various noises or whether it truly compensates for the loss of TFS cues and thereby restores masking release, as observed with natural speech. Results comparing masking release for three processing conditions (single-carrier, dual-carrier, and natural speech) in five backgrounds (speech-shaped noise, speech-modulated noise, and 1, 2, or 8 talkers) will be discussed.
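As an illustration of the quantity being compared, masking release is often expressed as the intelligibility score obtained in a fluctuating background minus the score obtained in a steady background at the same signal-to-noise ratio. The sketch below follows that convention. It is an assumption for illustration, not necessarily the computation used in this study; the function name `masking_release` and the example scores are hypothetical.

```python
def masking_release(score_fluctuating, score_steady):
    """Return masking release in percentage points.

    Assumed convention (not taken from the study): the score in a
    fluctuating background minus the score in a steady background,
    both given as percent correct at the same signal-to-noise ratio.
    A positive value means listeners benefited from the fluctuations.
    """
    for score in (score_fluctuating, score_steady):
        if not 0.0 <= score <= 100.0:
            raise ValueError("scores must be percent correct in [0, 100]")
    return score_fluctuating - score_steady


# Placeholder scores for illustration only; these are not data from the study.
example = masking_release(score_fluctuating=60.0, score_steady=45.0)
print(example)  # 15.0
```

Under this convention, a processing condition that merely raises scores by the same amount in every background leaves masking release unchanged, whereas a condition that restores dip listening raises scores more in fluctuating backgrounds than in steady ones.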
