Abstract

There are many situations where non-real-time speech enhancement is required. For such applications, employing any available a priori knowledge can lead to more effective enhancement solutions. In this study, a novel text-directed speech enhancement algorithm is developed for usage in non-real-time applications. In our approach, the text of the intended dialogue is used to partition noisy speech into regions of broad phoneme classifications. Classes considered include stops, fricatives, affricates, nasals, vowels, semivowels, diphthongs and silence. These partitions are then used to direct a new vector quantizer based enhancement scheme in which phone-class directed constraints are applied to improve speech quality. The proposed algorithm is evaluated using both objective as well as subjective quality assessment techniques. It is shown that the text-directed approach improves the quality of the degraded speech over a broad range of noise sources (i.e., flat communications channel noise, aircraft cockpit noise, helicopter fly-by noise, and automobile highway noise) and over a broad range of signal-to-noise ratios (i.e., 10, 5, 0 and −5 dB). In each case, the proposed method is shown consistently to exhibit improved objective quality over linear and generalized spectral subtraction, as well as the Auto-LSP constrained iterative enhancement method using the Itakura-Saito measure and a 100-sentence evaluation speech corpus. Subjective quality assessment was conducted in the form of an A-B comparison test. Results of these evaluations demonstrate that, for wideband noise distortions, the proposed algorithm is preferred over the unprocessed noisy speech more than 2 to 1, while the proposed algorithm is preferred over spectral subtraction by more than 3 to 1.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.