Abstract

The speech intelligibility of indoor public address systems is degraded by reverberation and background noise. This paper proposes a preprocessing method that combines speech enhancement and inverse filtering to improve the speech intelligibility in such environments. An energy redistribution speech enhancement method was modified for use in reverberation conditions, and an auditory-model-based fast inverse filter was designed to achieve better dereverberation performance. An experiment was performed in various noisy, reverberant environments, and the test results verified the stability and effectiveness of the proposed method. In addition, a listening test was carried out to compare the performance of different algorithms subjectively. The objective and subjective evaluation results reveal that the speech intelligibility is significantly improved by the proposed method.

Highlights

  • The Bark scale is not an auditory model and cannot simulate the frequency response characteristics of the basilar membrane in the cochlea

  • An indoor public address (I-PA) system is a sound amplification system that is widely used in auditoriums, classrooms, factories, and conference rooms

  • Sound transmission in enclosed spaces is regarded as a linear time invariant (LTI) system [5, 6], so the output response of the system can be expressed as the convolution of the input signal and room impulse response (RIR)
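
The LTI assumption in the last highlight can be illustrated with a short sketch: the signal reaching the listener is the convolution of the source signal with the RIR. The RIR below is a toy one (exponentially decaying noise), and the names `synthetic_rir` and `room_output`, the sampling rate, and the reverberation time are assumptions made for illustration; none of them come from the paper.

```python
import numpy as np

def synthetic_rir(fs, rt60, seed=0):
    """Toy RIR: white noise under an exponential decay (illustrative, not a measured room)."""
    rng = np.random.default_rng(seed)
    n = int(fs * rt60)
    t = np.arange(n) / fs
    # Amplitude falls by 60 dB (a factor of 10^-3) over rt60 seconds.
    envelope = 10.0 ** (-3.0 * t / rt60)
    h = rng.standard_normal(n) * envelope
    h[0] = 1.0  # direct-path component
    return h

def room_output(x, h):
    """Output of the room modelled as an LTI system: y = x * h (linear convolution)."""
    return np.convolve(x, h)

fs = 8000                                          # assumed sampling rate in Hz
h = synthetic_rir(fs, rt60=0.5)                    # assumed reverberation time of 0.5 s
x = np.random.default_rng(1).standard_normal(fs)   # 1 s of noise standing in for speech
y = room_output(x, h)
```

Because the model is linear and time invariant, scaling or delaying `x` scales or delays `y` in the same way, which is the property that makes inverse filtering of the RIR meaningful in the first place.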


Summary

Introduction

The Bark scale is not an auditory model and cannot simulate the frequency response characteristics of the basilar membrane in the cochlea. These equalization methods do not account for the influence of background noise on speech intelligibility. Although the Multizone and ASII methods could improve the speech intelligibility in noisy and reverberant environments, the distortion of the speech transmission channel and the auditory features of the human ear were not considered at the same time during the signal preprocessing. This paper proposes a new preprocessing method for improving speech intelligibility that combines the PDMSE method and the FIF method. The gain function α is calculated by the PDMSE algorithm, and the inverse sub-filters v_i are obtained by the GT-filter-based FIF algorithm. Both parameters are used to adjust the preprocessing speech signal to obtain the best speech intelligibility.

Improved preprocessing speech enhancement
Synthesis of preprocessing speech signals
Methods
Conclusions
