A Speech Preprocessing Method Based on Perceptually Optimized Envelope Processing to Increase Intelligibility in Reverberant Environments

Ali Fallah, Steven Van De Par

doi:10.3390/app112210788

Open Access

Abstract

Speech intelligibility in public places can be degraded by environmental noise and reverberation. In this study, a new near-end listening enhancement (NELE) approach is proposed in which a time-varying filter jointly enhances onsets and reduces overlap masking. The optimization requires some look-ahead in the clean speech and prior knowledge of the room impulse response (RIR). By minimizing a defined cost function, the method brings the spectro-temporal envelope of the reverberant speech as close as possible to that of the clean speech; speech onsets receive increased weight in this cost function. This approach differs from the overlap-masking ratio (OMR) and onset enhancement (OE) approaches (Grosse, van de Par, 2017, J. Audio Eng. Soc., Vol. 65 (1/2), pp. 31–41), which consider only previous frames in each time slot when determining the time-variant filtering. Speech reception threshold (SRT) measurements show that the new optimization framework improves speech intelligibility by up to 2 dB more than OE.
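The idea of an onset-weighted envelope cost can be illustrated with a minimal sketch. It is not the authors' cost function, which this page does not give. It assumes band envelopes stored as frames-by-bands arrays, a per-band energy decay standing in for the RIR, a squared-error distance, and a hypothetical `onset_gain` parameter that raises the weight wherever the clean envelope rises. All function and parameter names are illustrative.

```python
import numpy as np

def reverberant_envelope(env, decay):
    """Convolve each band's envelope (frames x bands) with that band's decay."""
    n_frames, n_bands = env.shape
    out = np.zeros((n_frames, n_bands), dtype=float)
    for k in range(n_bands):
        out[:, k] = np.convolve(env[:, k], decay[:, k])[:n_frames]
    return out

def onset_weights(clean_env, onset_gain=2.0):
    """Weight 1 everywhere, plus onset_gain on frames where the envelope rises.

    The rise detector and onset_gain are assumptions made for this sketch.
    """
    rise = np.diff(clean_env, axis=0, prepend=clean_env[:1])
    return 1.0 + onset_gain * (rise > 0)

def envelope_cost(gains, clean_env, decay, onset_gain=2.0):
    """Onset-weighted squared distance between reverberant and clean envelopes.

    gains: time-varying per-band gains applied to the clean envelope
           before it passes through the (assumed known) room decay.
    """
    reverb_env = reverberant_envelope(gains * clean_env, decay)
    weights = onset_weights(clean_env, onset_gain)
    return float(np.sum(weights * (reverb_env - clean_env) ** 2))
```

Because the gain applied in one frame spills into later frames through the decay, a gain choice can only be judged against the clean frames that follow it, which is one way to see why look-ahead and knowledge of the RIR are needed.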

Highlights

In conventional speech enhancement methods, the speech signal is recovered from a mixture of reverberation and noise
Speech-shaped noise (SSN) and pink noise (PN) are used as the interferers, which are convolved with a binaural room impulse response (BRIR) and presented at an average level of 65 dB SPL for the left and right ears
The cochleagram of a clean speech signal from the OLSA corpus and two preprocessed speech signals, one preprocessed with the Onset Enhancement (OE) algorithm [16] and the other with the proposed algorithm, are depicted in Figure 3a. The weights are calculated for room R4
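The interferer construction mentioned in the highlights (noise convolved with a BRIR and set to a target level) might be sketched as follows. This is an illustration under stated assumptions, not the authors' stimulus code. Pink noise is generated by spectral shaping, "average level" is read as mean power across both ears, and `ref_rms` is a hypothetical calibration constant giving the digital RMS that corresponds to 0 dB SPL.

```python
import numpy as np

def pink_noise(n, rng):
    """Shape white Gaussian noise to a 1/f power spectrum."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n)
    freqs[0] = freqs[1]  # avoid division by zero at DC
    return np.fft.irfft(spectrum / np.sqrt(freqs), n)

def binaural_interferer(noise, brir, target_db, ref_rms):
    """Convolve mono noise with a two-channel BRIR and scale to target_db.

    brir: array of shape (taps, 2), left and right ear responses.
    ref_rms: assumed digital RMS corresponding to 0 dB SPL (calibration).
    """
    ears = np.stack(
        [np.convolve(noise, brir[:, ch]) for ch in range(2)], axis=1
    )
    current_rms = np.sqrt(np.mean(ears ** 2))  # power averaged over both ears
    target_rms = ref_rms * 10.0 ** (target_db / 20.0)
    return ears * (target_rms / current_rms)
```

A call such as `binaural_interferer(pink_noise(48000, np.random.default_rng(0)), brir, 65.0, ref_rms)` would return a two-column signal; the actual `ref_rms` depends on the playback chain and has to be measured.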

Summary

Introduction

In conventional speech enhancement methods, the speech signal is recovered from a mixture of reverberation and noise. The intelligibility-improving signal processing approach (IISPA) [11] is another DNN-based method that uses an automatic-speech-recognition-based model of speech perception to optimize different parameters such as band-pass edge frequencies, spectral slope and curvature, and spectral modulation compression or expansion. Note that in these noise-dependent methods, the quality of speech can degrade strongly in the presence of non-stationary noise.

Proposed NELE Method

Preprocessing

Construction

Optimization Unit

Stimuli

Binaural Room Impulse Responses

Evaluation

Signal Processing Details

Effect of the Algorithm on Signal

Objective

Subjective Evaluation

Discussion

Journal: Applied Sciences	Publication Date: Nov 15, 2021
Citations: 1	License type: CC BY 4.0
