Noise reduction for speech applications is often formulated as a digital filtering problem, where the clean speech estimate is obtained by passing the noisy speech through a linear filter or transform. With such a formulation, the core issue of noise reduction becomes how to design an optimal filter, based on the statistics of the speech and noise signals, that significantly suppresses noise without introducing perceptually noticeable speech distortion. The optimal filters can be designed either in the time domain or in a transform domain. The advantage of working in a transform domain is that, if the transform is selected properly, the speech and noise signals may be better separated in that domain, thereby enabling better filter estimation and noise reduction performance. Although many different transforms exist, most efforts in the field of noise reduction have focused only on the Fourier and Karhunen-Loève transforms. Even for these two, no formal study has been carried out to investigate which one outperforms the other. In this paper, we reformulate the noise reduction problem in a more generalized transform domain. We show some of the advantages of working in this generalized domain: 1) different transforms can replace each other without any change to the algorithm (optimal filter) formulation, and 2) it is easier to compare different transforms fairly for their noise reduction performance. We also address how to design different optimal and suboptimal filters in such a generalized transform domain.
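As a rough illustration of the idea that transforms can be swapped without changing the filter formulation, the following Python sketch applies the same per-coefficient gain rule under three unitary transforms. It is not the paper's algorithm: it assumes the speech and noise covariance matrices are known, uses a simple Wiener-style gain in each transform coefficient, and works on a toy covariance model. All function and variable names (`unitary_dft`, `klt_basis`, `wiener_gains`, `denoise_frame`, `expected_mse`, `R_x`, `R_v`) are hypothetical.

```python
import numpy as np

def unitary_dft(n):
    # Unitary DFT matrix: the DFT matrix scaled by 1/sqrt(n), so U^H U = I.
    return np.fft.fft(np.eye(n)) / np.sqrt(n)

def klt_basis(cov):
    # KLT basis: orthonormal eigenvectors of a covariance matrix, as columns.
    _, vecs = np.linalg.eigh(cov)
    return vecs

def coefficient_variances(U, R):
    # Diagonal of U^H R U: the variance of each transform coefficient.
    return np.real(np.diag(U.conj().T @ R @ U))

def wiener_gains(U, R_x, R_v):
    # Assumed gain rule: speech variance over noisy-signal variance,
    # computed independently for each transform coefficient.
    phi_x = coefficient_variances(U, R_x)
    phi_v = coefficient_variances(U, R_v)
    return phi_x / (phi_x + phi_v)

def denoise_frame(y, U, R_x, R_v):
    # Analysis, per-coefficient gain, synthesis. Only U differs between transforms.
    coeffs = U.conj().T @ y
    return np.real(U @ (wiener_gains(U, R_x, R_v) * coeffs))

def expected_mse(U, R_x, R_v):
    # Expected squared error of the gain rule above, assuming speech and
    # noise are uncorrelated: sum over coefficients of phi_x*phi_v/(phi_x+phi_v).
    phi_x = coefficient_variances(U, R_x)
    phi_v = coefficient_variances(U, R_v)
    return float(np.sum(phi_x * phi_v / (phi_x + phi_v)))

# Toy statistics (assumptions for illustration, not data from the paper).
n = 8
idx = np.arange(n)
R_x = 0.9 ** np.abs(idx[:, None] - idx[None, :])  # AR(1)-like "speech" covariance
R_v = 0.5 * np.eye(n)                              # white-noise covariance

transforms = {
    "identity": np.eye(n),       # time domain, for reference
    "fourier": unitary_dft(n),
    "klt": klt_basis(R_x),
}

for name, U in transforms.items():
    print(name, expected_mse(U, R_x, R_v))
```

Because every transform goes through the same `denoise_frame` and `expected_mse` functions, the printed values compare the transforms under one common criterion. The comparison only holds for this toy covariance model and this particular gain rule.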