Abstract

This paper describes a novel framework to sub-band based Histogram Equalization (HEQ) applied to robust speech recognition. We propose a frequency band specific equalization to compensate the noise distortion on the individual frequency bands. The proposed equalization framework is a two step process. In the first step, conventional histogram equalization is done. By analyzing the histograms of equalized cepstra, we show that the first stage of conventional HEQ approach does not compensate the sub-band specific noise distortion, even though the overall histogram is normalized. Hence, in the second stage, sub-band specific histogram equalization is done. Every frame of cepstral coefficients is decomposed into low-frequency (LF) cepstra and high-frequency (HF) cepstra. Separate equalization is done on LF and HF cepstra to compensate LF and HF specific noise distortion. The cepstra corresponding to the LF and HF bands are obtained by using simple averaging and differencing filters on the cepstral components within a particular frame. The proposed approach is referred to as Sub-band Histogram Equalization (S-HEQ). Using histogram analysis, we show that the S-HEQ approach is able to compensate for the sub-band specific noise distortion. S-HEQ approach shows a consistent improvement over the conventional HEQ approach with a relative improvement of 12% and 22.10% over conventional HEQ in WER on Aurora-2 and Aurora-4 databases respectively. Proposed equalization approach can also be used with the deep neural network based systems and has shown a consistent improvement in the recognition accuracies over conventional HEQ. Finally, the efficacy of the proposed S-HEQ approach for embedded real-time speech applications is shown by comparing the performance and computational complexity trade-off with other state-of-the-art noise compensation methods.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.