Abstract

Ideal binary masking (IBM), which uses prior information, improves speech intelligibility by applying a scaling factor that attenuates noise-dominated components. The main challenge is to construct an appropriate decision model that identifies whether a component is noise-dominant or speech-dominant. In this study, we based that decision on the signal-to-noise ratio (SNR) of the temporal amplitude envelope in the time-frequency domain. Processing was carried out in MATLAB. We first divided the noisy speech, from 200 Hz to 6 kHz, into 16 contiguous subbands, each with a bandwidth of approximately 1.5 times the equivalent rectangular bandwidth. Subband envelopes were obtained by taking the absolute value of the signal. The SNR of the temporal envelope was calculated over 40 ms windows. The mask was set to unity when the SNR exceeded −5 dB and to 0.5 otherwise. We evaluated the proposed IBM using word scores obtained for speech in speech-spectrum-shaped noise at SNRs of −2, −4, −6, and −8 dB. Sixteen native speakers (age 28 ± 3 years) with normal hearing were recruited for the study and completed the Modified Rhyme Test to assess intelligibility. The IBM produced statistically significant increases of up to 20% in mean word scores. [Work supported by NIOSH.]
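
The masking rule described above can be illustrated with a minimal sketch for a single subband. It is not the authors' MATLAB implementation. It assumes that the speech and noise components of the subband are available separately (the prior information), that the envelope SNR is an energy ratio of the absolute-value envelopes within each window, and that windows do not overlap. The function names `envelope_snr_db` and `subband_mask` and all parameter names are hypothetical.

```python
import math


def envelope_snr_db(speech, noise, eps=1e-12):
    """SNR in dB between the absolute-value envelopes of two frames.

    Assumption: the SNR is the ratio of envelope energies; the abstract
    does not state the exact formula.
    """
    speech_energy = sum(abs(s) ** 2 for s in speech)
    noise_energy = sum(abs(n) ** 2 for n in noise)
    # eps guards against log(0) and division by zero for silent frames.
    return 10.0 * math.log10((speech_energy + eps) / (noise_energy + eps))


def subband_mask(speech, noise, fs, window_s=0.040,
                 threshold_db=-5.0, floor_gain=0.5):
    """Return one gain per non-overlapping window of a single subband.

    The gain is 1.0 when the envelope SNR exceeds threshold_db and
    floor_gain otherwise, following the rule stated in the abstract.
    """
    win = max(1, int(round(window_s * fs)))  # samples per 40 ms window
    gains = []
    for start in range(0, len(speech), win):
        snr = envelope_snr_db(speech[start:start + win],
                              noise[start:start + win])
        gains.append(1.0 if snr > threshold_db else floor_gain)
    return gains


if __name__ == "__main__":
    fs = 1000  # Hz, chosen only to keep the example small
    speech = [1.0] * 40 + [0.1] * 40  # strong frame, then weak frame
    noise = [1.0] * 80
    print(subband_mask(speech, noise, fs))
```

In a full system, each gain would multiply the corresponding window of the noisy subband signal before the 16 subbands are summed. That resynthesis step is not shown here.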
