Abstract
Most existing speech enhancement algorithms aim to improve speech quality, and algorithms that effectively improve speech intelligibility are rare. Speech intelligibility has been found to improve listening comfort, and it is closely related to the distortion of the speech signal. Studies have assessed the impact of the speech distortion introduced by gain functions and shown that one of the main reasons existing algorithms cannot improve speech intelligibility is that they allow amplification distortions in excess of 6 dB. These distortions of the enhanced amplitude spectrum should therefore be corrected to improve speech intelligibility. The earlier work by Loizou et al. obtained its experimental results under ideal conditions and cannot be used in practice, because clean speech is not available in practice. In this paper, we modify the method proposed by Loizou et al. and select the estimated speech under two hypothetical conditions to verify the improvement in speech intelligibility. The short-time objective intelligibility (STOI) values confirm the improvement and show that the intelligibility-oriented algorithm can be applied successfully in practice.
Highlights
Humans communicate with the outside world through speech signals, but the ubiquitous noise in life can cause a lot of interference to voice communication
The clean speech and noise signals were recorded at a sampling rate of 8 kHz with 16-bit quantization, and were processed in 20 ms frames with 50% overlap
The modified algorithm was tested on 4 clean speech signals and 4 noise signals at 4 SNR levels (0 dB, 5 dB, 10 dB and 15 dB), using two different estimates of the speech amplitude spectrum
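The framing and test figures in the highlights above can be checked with a little arithmetic. The following is a minimal sketch, not the authors' code: the function names are hypothetical, dropping trailing samples that do not fill a frame is an assumption, and so is the reading that the speech, noise, SNR and estimate factors are fully crossed.

```python
from itertools import product

SAMPLE_RATE_HZ = 8000   # sampling rate stated in the text
FRAME_MS = 20           # frame duration stated in the text
OVERLAP = 0.5           # 50% overlap stated in the text


def frame_params(sample_rate_hz=SAMPLE_RATE_HZ, frame_ms=FRAME_MS, overlap=OVERLAP):
    """Return (frame length, hop size) in samples."""
    frame_len = int(round(sample_rate_hz * frame_ms / 1000))
    hop = int(round(frame_len * (1 - overlap)))
    return frame_len, hop


def split_frames(signal, frame_len, hop):
    """Split a sequence into overlapping full frames.

    Assumption: trailing samples that do not fill a frame are dropped.
    """
    if len(signal) < frame_len:
        return []
    count = 1 + (len(signal) - frame_len) // hop
    return [signal[i * hop:i * hop + frame_len] for i in range(count)]


frame_len, hop = frame_params()
one_second = [0.0] * SAMPLE_RATE_HZ          # placeholder signal, 1 s of silence
frames = split_frames(one_second, frame_len, hop)
print(frame_len, hop, len(frames))           # 160 80 99

# Assumption: every speech/noise/SNR/estimate combination is tested.
SNRS_DB = [0, 5, 10, 15]
conditions = list(product(range(4), range(4), SNRS_DB, range(2)))
print(len(conditions))                       # 128
```

Under these assumptions each frame holds 160 samples with an 80-sample hop, and the test grid contains 128 conditions.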
Summary
Humans communicate with the outside world through speech signals, but the ubiquitous noise in everyday life can cause a lot of interference to voice communication. Amplification distortion arises when the target signal is over-estimated (e.g., if a is the true value of the target envelope, the estimated envelope is a + ∆a, where ∆a is some positive increment), and attenuation distortion arises when the target signal is underestimated (e.g., the estimated envelope is a − ∆a). The perceptual effects of these two distortions on speech intelligibility cannot be treated as equivalent; in practice there has to be the right balance between them, and in most cases this balance is not known. At the start of the paper, we analyze the impact of the two types of speech distortion (amplification distortion and attenuation distortion) introduced by noise-suppressive gain functions. The paper summarizes the reasons that existing algorithms do not improve speech intelligibility and the methods that can improve it.
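The two distortion types, and the 6 dB amplification limit mentioned in the abstract, can be illustrated by comparing an estimated magnitude with the true one. This is a minimal sketch under stated assumptions: the function names and category labels are hypothetical, an exact match is grouped with attenuation, and the true magnitude a is assumed known, which holds only in an oracle analysis and not in practice.

```python
import math


def distortion_db(true_mag, est_mag):
    """Distortion of the estimate relative to the true magnitude, in dB.

    Positive values mean over-estimation (a + delta), negative values
    mean under-estimation (a - delta).
    """
    if true_mag <= 0 or est_mag <= 0:
        raise ValueError("magnitudes must be positive")
    return 20.0 * math.log10(est_mag / true_mag)


def classify(true_mag, est_mag, limit_db=6.0):
    """Label one magnitude estimate (labels are illustrative only)."""
    d = distortion_db(true_mag, est_mag)
    if d <= 0.0:
        return "attenuation"                # estimate at or below the true value
    if d <= limit_db:
        return "amplification_within_limit"
    return "amplification_over_limit"       # the case flagged as harmful in the text


a = 1.0  # true envelope value (only available in an oracle setting)
for estimate in (0.5, 1.5, 2.5):
    print(estimate, round(distortion_db(a, estimate), 2), classify(a, estimate))
```

A 6 dB limit on amplitude corresponds to an estimate roughly twice the true magnitude, since 20·log10(2) ≈ 6.02 dB.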