Abstract

This paper describes a method for improving the intelligibility of speech signals based on subband waveform processing. Our approach builds on the observation that clear speech has higher delta-cepstrum values in the transient parts between phonemes; it emphasizes the delta-cepstrum of input speech with a filter in the cepstral domain that amplifies a particular modulation frequency. However, because this approach generates synthetic sound with an analysis/synthesis system, the quality of the generated sound is sometimes degraded. To prevent this degradation, we introduce a subband waveform-based method. The method divides an input signal into several subband signals with a quadrature mirror filter (QMF), which enables approximately perfect reconstruction of the input signal from the subband signals; converts the amplification gain sequence in the cepstral domain into one in the subband-waveform domain; and then multiplies each subband signal by the converted gain sequence on a sample-by-sample basis. Synthetic sounds were generated with the number of subbands set to two, four, and eight. We found that the sound generated with two subbands contains artificial power fluctuation, and that increasing the number of subbands reduces this fluctuation and improves the quality of the generated sound.
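The split, gain, and reconstruct steps can be illustrated with a minimal two-band sketch. It is not the authors' implementation: it assumes the simplest possible QMF pair (the two-tap Haar filters, which happen to give exact reconstruction), and it takes the per-sample gain sequences as given rather than deriving them from the cepstral domain. The function names `haar_analysis`, `haar_synthesis`, and `apply_subband_gains` are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_analysis(x):
    """Split an even-length signal into low and high subbands, each decimated by 2.

    Assumption: the two-tap Haar pair stands in for the paper's QMF.
    """
    if len(x) % 2 != 0:
        raise ValueError("signal length must be even")
    low = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return low, high

def haar_synthesis(low, high):
    """Rebuild the full-rate signal from the two subband signals."""
    out = []
    for l, h in zip(low, high):
        out.append((l + h) / SQRT2)  # even-indexed sample
        out.append((l - h) / SQRT2)  # odd-indexed sample
    return out

def apply_subband_gains(x, gain_low, gain_high):
    """Multiply each subband sample by its gain, then resynthesize.

    gain_low and gain_high are hypothetical gain sequences with one value
    per subband sample; how they are obtained is outside this sketch.
    """
    low, high = haar_analysis(x)
    low = [s * g for s, g in zip(low, gain_low)]
    high = [s * g for s, g in zip(high, gain_high)]
    return haar_synthesis(low, high)
```

With all gains equal to one, the output matches the input up to floating-point rounding, which is the reconstruction property the method relies on. Splitting each band again in the same way would give four and then eight subbands.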
