Itakura-Saito Measure Research Articles

Abstract: To enhance the speech perception of hearing aid users in noisy environments, most hearing aid devices adopt beamforming algorithms such as the first-order differential microphone (DM1) and the two-stage directional microphone (DM2) algorithms, which preserve sounds from the direction of the interlocutor and reduce ambient sounds from other directions. However, these conventional algorithms show poor directionality in the low-frequency range. Therefore, to enhance the speech perception of hearing aid users in the low-frequency range, our group previously proposed a fractional delay subtraction and integration (FDSI) algorithm and estimated its theoretical performance by computer simulation in an earlier article. In this study, we performed a KEMAR test in a non-reverberant room that compares the performance of the DM1, DM2, broadband beamforming (BBF), and proposed FDSI algorithms using several objective indices: signal-to-noise ratio (SNR) improvement, segmental SNR (segSNR) improvement, perceptual evaluation of speech quality (PESQ), and the Itakura-Saito measure (IS). Experimental results showed that the performance of the FDSI algorithm was −3.26 to 7.16 dB in SNR improvement, −1.94 to 5.41 dB in segSNR improvement, 1.49 to 2.79 in PESQ, and 0.79 to 3.59 in IS, which demonstrated that the FDSI algorithm showed the highest improvement in SNR and segSNR and the lowest IS. We believe that the proposed FDSI algorithm has potential as a beamformer for digital hearing aid devices.

Key words: hearing aids, directional microphone, beamforming
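The SNR and segSNR improvement indices named in this abstract can be illustrated with a minimal sketch. It uses the common textbook definitions, which the abstract itself does not spell out, so the details here are assumptions: the residual noise is taken as the difference between the clean and processed signals, frames are non-overlapping, and per-frame values are clamped to an assumed range of −10 to 35 dB. The function names `snr_db`, `seg_snr_db`, and `improvement_db` and the default `frame_len` are hypothetical, not taken from the study.

```python
import math


def snr_db(clean, processed):
    """Global SNR in dB, treating (clean - processed) as residual noise."""
    if len(clean) != len(processed) or not clean:
        raise ValueError("signals must be non-empty and of equal length")
    signal_energy = sum(s * s for s in clean)
    noise_energy = sum((s - y) ** 2 for s, y in zip(clean, processed))
    if signal_energy == 0.0:
        raise ValueError("clean signal has zero energy")
    if noise_energy == 0.0:
        return math.inf  # processed signal equals the clean signal
    return 10.0 * math.log10(signal_energy / noise_energy)


def seg_snr_db(clean, processed, frame_len=256, lo=-10.0, hi=35.0):
    """Mean per-frame SNR over non-overlapping frames.

    The clamp limits lo/hi and frame_len are assumed defaults,
    not values reported in the abstract.
    """
    if len(clean) != len(processed):
        raise ValueError("signals must be of equal length")
    frame_values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        p = processed[start:start + frame_len]
        if sum(s * s for s in c) == 0.0:
            continue  # skip frames with no clean-signal energy
        frame_values.append(min(hi, max(lo, snr_db(c, p))))
    if not frame_values:
        raise ValueError("no usable frames")
    return sum(frame_values) / len(frame_values)


def improvement_db(metric, clean, noisy, enhanced, **kwargs):
    """Improvement = metric after processing minus metric before processing."""
    return metric(clean, enhanced, **kwargs) - metric(clean, noisy, **kwargs)
```

Under these assumptions, a call such as `improvement_db(seg_snr_db, clean, noisy, enhanced, frame_len=256)` would give a segSNR improvement in dB; a negative value, like the lower ends of the ranges reported above, means the processed signal is further from the clean reference than the unprocessed one.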

There are many situations where non-real-time speech enhancement is required. For such applications, employing any available a priori knowledge can lead to more effective enhancement solutions. In this study, a novel text-directed speech enhancement algorithm is developed for use in non-real-time applications. In our approach, the text of the intended dialogue is used to partition noisy speech into regions of broad phoneme classes. The classes considered are stops, fricatives, affricates, nasals, vowels, semivowels, diphthongs and silence. These partitions are then used to direct a new vector-quantizer-based enhancement scheme in which phone-class-directed constraints are applied to improve speech quality. The proposed algorithm is evaluated using both objective and subjective quality assessment techniques. It is shown that the text-directed approach improves the quality of the degraded speech over a broad range of noise sources (i.e., flat communications channel noise, aircraft cockpit noise, helicopter fly-by noise, and automobile highway noise) and over a broad range of signal-to-noise ratios (i.e., 10, 5, 0 and −5 dB). In each case, the proposed method consistently exhibits improved objective quality over linear and generalized spectral subtraction, as well as over the Auto-LSP constrained iterative enhancement method, as assessed with the Itakura-Saito measure on a 100-sentence evaluation speech corpus. Subjective quality assessment was conducted in the form of an A-B comparison test. Results of these evaluations demonstrate that, for wideband noise distortions, the proposed algorithm is preferred over the unprocessed noisy speech by more than 2 to 1 and over spectral subtraction by more than 3 to 1.
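As a rough illustration of the Itakura-Saito measure used for the objective comparisons above, the sketch below computes the standard discrete form D(P, Q) = (1/N) Σ [P_k/Q_k − ln(P_k/Q_k) − 1] between two power spectra. This is an assumption about the general definition only: the abstracts do not say how the spectra were obtained (speech studies often derive them from LPC models), and the function name `itakura_saito` is hypothetical.

```python
import math


def itakura_saito(p, q):
    """Itakura-Saito divergence between two strictly positive power spectra.

    p is the reference spectrum and q the test spectrum, sampled at the
    same frequency bins. The measure is not symmetric in p and q.
    """
    if len(p) != len(q) or not p:
        raise ValueError("spectra must be non-empty and of equal length")
    total = 0.0
    for pk, qk in zip(p, q):
        if pk <= 0.0 or qk <= 0.0:
            raise ValueError("power spectra must be strictly positive")
        ratio = pk / qk
        total += ratio - math.log(ratio) - 1.0
    return total / len(p)
```

The result is zero only when the two spectra are identical and grows as they diverge, which is why a lower IS value is read as better quality. Because the measure is asymmetric, swapping the reference and test spectra generally gives a different value.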

Articles published on Itakura-Saito Measure

Learning HMM State Sequences from Phonemes for Speech Synthesis

Quantitative Evaluation of an FDSI Beamforming Algorithm for Monaural Hearing Aids Using a KEMAR Mannequin

A Combined Psychoacoustic Approach for Stereo Acoustic Echo Cancellation

Text-directed speech enhancement employing phone class parsing and feature map constrained vector quantization

A frequency weighted Itakura-Saito spectral distance measure

