Abstract
Ideal binary masking (IBM), which relies on prior information, improves speech intelligibility by applying a scaling factor that attenuates noise-dominant components. The main challenge is to construct an appropriate decision model for identifying noise-dominant and speech-dominant components. In this study, we used the signal-to-noise ratio (SNR) of the temporal amplitude envelope in the time-frequency domain. We first divided the noisy speech, processed in MATLAB, into 16 contiguous subbands spanning 200 Hz to 6 kHz, each with a bandwidth of approximately 1.5 equivalent rectangular bandwidths. The subband envelopes were obtained by taking the absolute value of the signal. Envelope SNRs were calculated over 40 ms windows. The mask was unity when the SNR exceeded −5 dB; otherwise, it was 0.5. We evaluated the proposed IBM using word scores for speech in speech-spectrum-shaped noise at SNRs of −2, −4, −6, and −8 dB. Sixteen native speakers (age 28 ± 3 years) with normal hearing were recruited and completed the Modified Rhyme Test to assess intelligibility. The proposed IBM produced statistically significant increases of up to 20% in mean word scores. [Work supported by NIOSH.]
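
The masking rule described above can be illustrated for a single subband with a minimal Python sketch. This is not the authors' MATLAB implementation: the function name `envelope_snr_mask`, its parameters, the definition of SNR as a ratio of envelope energies, and the dropping of a trailing partial window are all assumptions made for illustration. Only the absolute-value envelope, the 40 ms window, the −5 dB threshold, and the gains of 1 and 0.5 come from the abstract.

```python
import numpy as np

def envelope_snr_mask(speech_band, noise_band, fs, win_ms=40.0,
                      threshold_db=-5.0, attenuation=0.5):
    """Return one gain per window for a single subband (hypothetical helper)."""
    # Envelope taken as the absolute value of the subband signal.
    speech_env = np.abs(np.asarray(speech_band, dtype=float))
    noise_env = np.abs(np.asarray(noise_band, dtype=float))

    win = int(round(fs * win_ms / 1000.0))   # samples per window
    n_win = len(speech_env) // win           # assumption: drop trailing partial window
    eps = 1e-12                              # avoids log of zero

    gains = np.empty(n_win)
    for k in range(n_win):
        seg = slice(k * win, (k + 1) * win)
        # Assumed SNR definition: ratio of envelope energies, in dB.
        snr_db = 10.0 * np.log10((np.sum(speech_env[seg] ** 2) + eps)
                                 / (np.sum(noise_env[seg] ** 2) + eps))
        # Unity gain above the threshold, otherwise attenuate by 0.5.
        gains[k] = 1.0 if snr_db > threshold_db else attenuation
    return gains

# Toy input: two 40 ms windows at fs = 1000 Hz, strong then weak speech.
fs = 1000
speech = np.concatenate([np.ones(40), 0.1 * np.ones(40)])
noise = 0.5 * np.ones(80)
gains = envelope_snr_mask(speech, noise, fs)
```

In a full system, each gain would presumably be applied to the corresponding window of the noisy subband signal before the 16 subbands are summed; the abstract does not specify that step, so it is left out here.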