Abstract
This study explored the extent to which informational masking of speech depends on the frequency region and number of extraneous formants in an interferer. Target formants, monotonized three-formant (F1+F2+F3) analogues of natural sentences, were presented monaurally, with target ear assigned randomly on each trial. Interferers were presented contralaterally. In experiment 1, single-formant interferers were created using the time-reversed F2 frequency contour and constant amplitude, root-mean-square (RMS)-matched to F2. Interferer center frequency was matched to that of F1, F2, or F3, while maintaining the extent of formant-frequency variation (depth) on a log scale. Adding an interferer lowered intelligibility; the effect of frequency region was small and broadly tuned around F2. In experiment 2, interferers comprised either one formant (F1, the most intense) or all three, created using the time-reversed frequency contours of the corresponding targets and RMS-matched constant amplitudes. Interferer formant-frequency variation was scaled to 0%, 50%, or 100% of the original depth. Increasing the depth of formant-frequency variation and number of formants in the interferer had independent and additive effects. These findings suggest that the impact on intelligibility depends primarily on the overall extent of frequency variation in each interfering formant (up to about 100% depth) and the number of extraneous formants.
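The contour manipulations described above (time reversal, depth scaling, and re-centering on a log scale) can be illustrated with a minimal sketch. It is not the authors' synthesis code: the function names, the use of the geometric mean as the center of the contour, and the example frequencies are all assumptions made for illustration.

```python
import math


def geometric_mean(freqs_hz):
    """Mean frequency on a log scale (assumed here to define the contour's center)."""
    return math.exp(sum(math.log(f) for f in freqs_hz) / len(freqs_hz))


def make_interferer_contour(target_hz, depth=1.0, center_hz=None):
    """Hypothetical helper: build an interferer contour from a target formant contour.

    target_hz : formant frequencies (Hz) sampled at successive time points.
    depth     : 1.0 keeps the original log-frequency excursion, 0.5 halves it,
                0.0 yields a constant frequency.
    center_hz : new center frequency; None keeps the original geometric mean.
    """
    gm = geometric_mean(target_hz)
    center = gm if center_hz is None else center_hz
    # Time-reverse the contour, then scale each excursion from the center
    # in log-frequency terms (a power law on the frequency ratio).
    reversed_hz = list(reversed(target_hz))
    return [center * (f / gm) ** depth for f in reversed_hz]


# Illustrative contour only; these values are not taken from the study.
example_f2 = [1000.0, 2000.0, 1000.0, 500.0]
half_depth = make_interferer_contour(example_f2, depth=0.5)
moved_to_500 = make_interferer_contour(example_f2, depth=1.0, center_hz=500.0)
```

Because the scaling acts on frequency ratios, moving the center frequency leaves the excursion unchanged in octaves, which is one way to read "maintaining the depth on a log scale".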
Highlights
An important requirement for successful communication in the auditory scenes often encountered in everyday life is the ability of the listener to attend to the speech of the talker despite the presence of interfering sounds, including other speech (Cherry, 1953; see Bregman, 1990; Darwin, 2008).
Speech is a sparse signal in a frequency × time representation, and so there are often circumstances in which the interference arises mainly from informational masking, e.g., when the target talker is accompanied by one competing talker with a similar level (Brungart, 2001; Brungart et al., 2006; see Darwin, 2008).
All speech analogues were synthesized using MITSYN (Henke, 2005) at a sample rate of 22.05 kHz and with 10-ms raised-cosine onset and offset ramps. They were played at 16-bit resolution over Sennheiser HD 480-13II earphones (Hannover, Germany) via a Sound Blaster X-Fi HD sound card (Creative Technology Ltd, Singapore), programmable attenuators [Tucker-Davis Technologies (TDT) PA5; Alachua, FL], and a headphone buffer (TDT HB7).
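As an illustration of the onset and offset shaping mentioned above, the following sketch applies 10-ms raised-cosine ramps to a sampled signal at 22.05 kHz. It does not reproduce MITSYN; the function names and the exact ramp formula (a half-cycle raised cosine starting at zero gain) are assumptions.

```python
import math

SAMPLE_RATE_HZ = 22050  # 22.05 kHz, as stated in the text
RAMP_MS = 10.0          # ramp duration, as stated in the text


def raised_cosine_ramp(n):
    """Gain values rising smoothly from 0 toward 1 over n samples."""
    return [0.5 * (1.0 - math.cos(math.pi * i / n)) for i in range(n)]


def apply_ramps(samples, sample_rate_hz=SAMPLE_RATE_HZ, ramp_ms=RAMP_MS):
    """Hypothetical helper: return a copy of samples with onset and offset ramps."""
    n = int(round(sample_rate_hz * ramp_ms / 1000.0))
    if 2 * n > len(samples):
        raise ValueError("signal is shorter than the two ramps combined")
    ramp = raised_cosine_ramp(n)
    out = list(samples)
    for i, gain in enumerate(ramp):
        out[i] *= gain        # onset: fade in from the first sample
        out[-1 - i] *= gain   # offset: mirror-image fade out to the last sample
    return out


# Illustrative use on a constant-amplitude signal of 1000 samples.
shaped = apply_ramps([1.0] * 1000)
```

The ramps only affect the first and last 10 ms of the signal; samples in between are left unchanged.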
Summary
An important requirement for successful communication in the auditory scenes often encountered in everyday life is the ability of the listener to attend to the speech of the talker despite the presence of interfering sounds, including other speech (Cherry, 1953; see Bregman, 1990; Darwin, 2008). Some studies have investigated the ability to listen with independent ears by asking listeners to identify monaural target speech when presented alone or accompanied by a contralateral masker whose properties have been manipulated in various ways (e.g., Brungart et al., 2005; Gallun et al., 2007). This general approach has been extended to an arrangement in which a simplified three-formant analogue of a sentence-length utterance (F1+F2+F3) is accompanied in the contralateral ear by a single-formant interferer (Roberts and Summers, 2015; Summers et al., 2016). One possible explanation for why the impact of F2C plateaus once its depth of formant-frequency variation exceeds 100% is that extraneous acoustic-phonetic information in frequency regions outside the typical F2 range causes little additional interference. The current study explored the extent to which informational masking is governed by the frequency region occupied by an interferer, the extent of formant-frequency variation in each interfering formant, and the number of interfering formants.