Abstract
A discrete cosine transform (DCT) domain speech enhancement algorithm is proposed that models the evolution of speech DCT coefficients as a time-varying autoregressive process. Rao-Blackwellized particle filter (RBPF) techniques are used to estimate the model parameters and recover the clean signal coefficients. Using very low-order models for each coefficient and operating at a decimated frame rate, the proposed approach provides a significant complexity reduction compared to the standard full-band RBPF speech enhancement algorithm. In addition to the complexity gains, performance is also improved. Modeling the speech signal in the DCT-domain is shown to provide a better fit in spectral troughs, leading to more noise reduction and less speech distortion. To illustrate possible frequency-dependent processing strategies, a hybrid structure is proposed that offers a complexity/performance trade-off by substituting a simple DCT Wiener filter for the DCT-RBPF in some bands. In comparisons with high performing speech enhancement algorithms using wideband speech and noise, the proposed DCT-RBPF algorithm achieves higher scores on objective quality and intelligibility measures.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have