Abstract

In this paper, we propose performing sub-band decomposition via the discrete wavelet transform (DWT) before gain normalization (GN) when producing speech features. In the presented approach, we apply the DWT to decompose the temporal sequence of cepstral features into sub-band streams, and then perform gain normalization on each sub-band feature stream separately. Finally, the new feature stream is obtained by applying the inverse DWT to the normalized sub-band streams. Compared with gain normalization performed directly on the original full-band stream, the presented approach can deal with sub-band distortions individually and is expected to be more noise-robust. On the Aurora-2 database and task, the new sub-band GN outperforms the baseline process and the original full-band GN by 65.51% and 18.20%, respectively, in relative word error reduction.
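The decompose, normalize, and reconstruct pipeline described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not name the wavelet, the number of decomposition levels, or the exact form of gain normalization. Here we assume a one-level Haar wavelet, an even-length stream for a single cepstral coefficient, and a GN that subtracts the mean and divides by the dynamic range (maximum minus minimum). All function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_dwt(x):
    """One-level Haar DWT: returns (approximation, detail) sub-band streams."""
    if len(x) % 2 != 0:
        raise ValueError("this sketch assumes an even-length stream")
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleaves the reconstructed sample pairs."""
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / SQRT2)
        x.append((a - d) / SQRT2)
    return x

def gain_normalize(band):
    """Assumed GN: remove the mean, then divide by the dynamic range."""
    mean = sum(band) / len(band)
    centered = [v - mean for v in band]
    span = max(centered) - min(centered)
    if span == 0.0:
        # A constant band has no dynamic range; leave it at zero.
        return centered
    return [v / span for v in centered]

def subband_gn(stream):
    """Sub-band GN: DWT, per-band gain normalization, then inverse DWT."""
    approx, detail = haar_dwt(stream)
    return haar_idwt(gain_normalize(approx), gain_normalize(detail))

# Example: the temporal trajectory of one cepstral coefficient (made-up values).
features = [1.0, 3.0, 2.0, 6.0, 5.0, 9.0, 4.0, 8.0]
normalized = subband_gn(features)
```

In this sketch each sub-band is scaled by its own dynamic range, so a distortion concentrated in one band does not set the gain applied to the other; full-band GN would instead call `gain_normalize` once on the whole stream.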
