This paper proposes a low-variance, adaptive-bandwidth spectral estimator for spectral subtraction, based on two-stage spectral estimation (TSSE) and modified cepstrum thresholding (MCT). In the first stage, both the raw periodogram and the noise power spectral density (NPSD) are smoothed over frequency according to the structure of the NPSD. The second stage, which is based on the structure of the speech spectrum, distinguishes the individual harmonic components of the speech signal. Because the TSSE accounts for the structure of both the NPSD and the speech spectrum, it provides a low-variance, adaptive-bandwidth spectral estimate for both noise and speech. Although spectral subtraction based on the TSSE (TSSE-SS) can solve the annoying musical noise problem, it cannot suppress non-stationary noise effectively, so the MCT is applied to the TSSE-SS to further reduce the non-stationary noise components. Experimental results show that the proposed algorithm achieves a higher signal-to-noise ratio improvement and higher PESQ scores than conventional spectral subtraction algorithms.
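For readers unfamiliar with the baseline, the following is a minimal sketch of conventional power spectral subtraction preceded by fixed-bandwidth smoothing over frequency. It is not the TSSE or the MCT described above: the function names, the moving-average window, the `floor` parameter, and the toy spectra are all assumptions chosen for illustration.

```python
def smooth_over_frequency(power, half_width):
    """Moving-average smoothing of a power spectrum across frequency bins.

    Hypothetical helper: a fixed window of up to 2 * half_width + 1 bins,
    truncated at the spectrum edges.
    """
    n = len(power)
    smoothed = []
    for k in range(n):
        lo = max(0, k - half_width)
        hi = min(n, k + half_width + 1)
        smoothed.append(sum(power[lo:hi]) / (hi - lo))
    return smoothed


def spectral_subtract(noisy_power, noise_power, floor=0.01):
    """Subtract the noise power estimate bin by bin.

    Each output bin is kept at or above floor * noisy power (an assumed
    flooring rule) so that no bin becomes negative.
    """
    return [max(p - d, floor * p) for p, d in zip(noisy_power, noise_power)]


# Toy data (made up): roughly flat noise near 1.0 with one peak in bin 3.
noisy_periodogram = [1.2, 0.8, 1.1, 9.0, 0.9, 1.3, 0.7, 1.0]
noise_psd = [1.0] * 8

smoothed = smooth_over_frequency(noisy_periodogram, half_width=1)
enhanced = spectral_subtract(smoothed, noise_psd)
```

With a fixed `half_width`, the smoothing lowers the variance of the flat regions but also spreads the peak in bin 3 into bins 2 and 4. That trade-off is the kind of problem an adaptive bandwidth is intended to address.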