Abstract
The speech intelligibility of indoor public address systems is degraded by reverberation and background noise. This paper proposes a preprocessing method that combines speech enhancement and inverse filtering to improve the speech intelligibility in such environments. An energy redistribution speech enhancement method was modified for use in reverberation conditions, and an auditory-model-based fast inverse filter was designed to achieve better dereverberation performance. An experiment was performed in various noisy, reverberant environments, and the test results verified the stability and effectiveness of the proposed method. In addition, a listening test was carried out to compare the performance of different algorithms subjectively. The objective and subjective evaluation results reveal that the speech intelligibility is significantly improved by the proposed method.
Highlights
Introduction
The bark scale is not an auditory model and cannot simulate the frequency response characteristics of the basilar membrane in the cochlea.
An indoor public address (I-PA) system is a sound amplification system that is widely used in auditoriums, classrooms, factories, and conference rooms.
Sound transmission in enclosed spaces is regarded as a linear time-invariant (LTI) system [5, 6], so the output response of the system can be expressed as the convolution of the input signal and the room impulse response (RIR).
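The LTI relationship can be illustrated with a minimal sketch. The names `convolve`, `speech`, `rir`, and `received`, as well as the toy sample values, are assumptions made for illustration and do not come from the paper; a measured RIR would be far longer.

```python
def convolve(x, h):
    """Linear convolution y[n] = sum_k x[k] * h[n - k]."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


# Hypothetical toy RIR: direct path followed by two decaying reflections.
rir = [1.0, 0.0, 0.5, 0.0, 0.25]
# Hypothetical short input signal standing in for speech samples.
speech = [1.0, -1.0, 2.0]

# Signal at the listener position under the LTI assumption.
received = convolve(speech, rir)
```

Each input sample is smeared over the length of the RIR, which is the reverberant tail that a dereverberation stage tries to undo.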
Summary
The bark scale is not an auditory model and cannot simulate the frequency response characteristics of the basilar membrane in the cochlea. These equalization methods do not account for the influence of background noise on speech intelligibility. Although the Multizone and ASII methods can improve speech intelligibility in noisy and reverberant environments, they do not consider the distortion of the speech transmission channel and the auditory features of the human ear at the same time during signal preprocessing. This paper proposes a new preprocessing method that improves speech intelligibility by combining the PDMSE method and the FIF method. The gain function α is calculated by the PDMSE algorithm, and the inverse sub-filters v_i are obtained by the GT-filter-based FIF algorithm. Both parameters are used to adjust the preprocessed speech signal to obtain the best speech intelligibility.
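To make the idea of inverse filtering concrete, the following is a minimal sketch of a generic regularized frequency-domain inverse filter, V = conj(H) / (|H|^2 + β). It is not the paper's GT-filter-based FIF and does not compute sub-filters v_i or the PDMSE gain α. The function names, the regularization value `beta`, the FFT length, and the two-tap toy RIR are all assumptions chosen for illustration.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for a short sketch)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def regularized_inverse_filter(h, n_fft, beta=1e-6):
    """Return (v, delay) where V = conj(H) / (|H|^2 + beta)."""
    padded = list(h) + [0.0] * (n_fft - len(h))
    H = dft(padded)
    V = [Hk.conjugate() / (abs(Hk) ** 2 + beta) for Hk in H]
    v = [c.real for c in dft(V, inverse=True)]
    # A circular shift acts as a modeling delay so the filter is causal.
    delay = n_fft // 2
    return v[-delay:] + v[:-delay], delay


def circular_convolve(a, b):
    """Circular convolution of two equal-length sequences."""
    n = len(a)
    return [sum(a[m] * b[(k - m) % n] for m in range(n)) for k in range(n)]


# Hypothetical toy RIR: direct path plus one reflection.
h = [1.0, 0.5]
n_fft = 8
v, delay = regularized_inverse_filter(h, n_fft)

# Channel followed by its inverse should approximate a delayed unit impulse.
h_padded = h + [0.0] * (n_fft - len(h))
equalized = circular_convolve(h_padded, v)
```

The regularization term β limits the gain of the inverse at frequencies where the room response is weak, at the cost of a slightly imperfect equalization.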
More From: EURASIP Journal on Audio, Speech, and Music Processing