Abstract

Single-channel speech enhancement based on short-time spectral amplitude (STSA) estimation often uses the unmodified noisy phase spectrum for speech re-synthesis, thereby introducing undesired artifacts into the enhanced speech. Using the discrete cosine transform (DCT) instead of the discrete Fourier transform (DFT) reduces these artifacts, because the consequences of using noisy DCT polarities for speech re-synthesis are less severe than those of using noisy DFT phases. Although DFT-based STSA estimators have been adequately studied in the past, such estimators have not been sufficiently developed for the DCT domain. This study aims to demonstrate the superiority of the DCT representation in STSA estimation-based speech enhancement. To achieve this, we first derive the DCT-based STSA estimator that minimizes the mean squared error (MSE) of the log-spectral amplitudes (LSA). We then propose a novel DCT polarity estimator to be used in combination with the STSA estimator. The clean speech DCT coefficients are modeled by a Gaussian or a Laplace density, and the noise DCT coefficients are modeled by a Gaussian density. To assess the enhanced speech, objective and subjective quality measures are employed. Results show that the new estimators perform better than the corresponding DFT-based estimators and are widely preferred over them by listeners. Moreover, the proposed STSA estimators can be expressed in closed form, whereas the DFT-based estimator with a super-Gaussian speech prior has no closed-form solution.
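To illustrate the amplitude/polarity split the abstract refers to, the following is a minimal sketch, not the estimators derived in the paper. It assumes an orthonormal DCT-II written in plain Python; the function names (`dct_ii`, `idct_ii`, `split_amplitude_polarity`, `resynthesize`) and the sample frame are hypothetical. Because DCT coefficients are real, the quantity that plays the role of the DFT phase is just a sign per coefficient.

```python
import math


def dct_ii(x):
    """Orthonormal DCT-II of a real sequence (direct O(n^2) form)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct_ii(c):
    """Inverse of the orthonormal DCT-II above (a DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def split_amplitude_polarity(coeffs):
    """Split real DCT coefficients into amplitudes and +/-1 polarities."""
    amplitudes = [abs(c) for c in coeffs]
    polarities = [1.0 if c >= 0 else -1.0 for c in coeffs]
    return amplitudes, polarities


def resynthesize(amplitudes, polarities):
    """Recombine amplitudes with polarities and return the time-domain frame."""
    return idct_ii([a * p for a, p in zip(amplitudes, polarities)])


# Hypothetical 8-sample frame standing in for one short-time analysis frame.
frame = [0.3, -0.1, 0.8, 0.5, -0.4, 0.2, 0.0, -0.6]
amps, pols = split_amplitude_polarity(dct_ii(frame))
rebuilt = resynthesize(amps, pols)
```

In an enhancement system, `amps` would be replaced by estimated clean-speech amplitudes before re-synthesis, and the polarities would either be taken from the noisy coefficients or estimated separately; this sketch leaves both unchanged, so `rebuilt` matches `frame` up to floating-point error.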
