Abstract

Noise in the environment can decrease the quality and intelligibility of a telephone conversation. This study focuses on the intelligibility enhancement of narrowband telephone speech in a near-end noise scenario using a postprocessing method based on normal-to-Lombard spectral tilt conversion. The proposed technique uses nonparallel, conversational normal, and Lombard speech together with Gaussian process regression in order to mimic the flattening of the spectral tilt that occurs in the production of natural speech in noisy conditions. The performance of the proposed method was evaluated in comparison to two reference methods, a fixed high-pass filter, and a baseline spectral tilt conversion, as well as in comparison to unprocessed speech in terms of intelligibility and listening preference in noisy conditions and in terms of pressedness in silent conditions. The results indicate that while the proposed technique provides a similar benefit in terms of intelligibility as fixed high-pass filtering, it is also able to produce a notable increase in pressedness. This suggests that the developed processing of the spectral tilt can compete with fixed high-pass filtering in intelligibility enhancement, but it is also able to convert speech to become perceptually closer to natural Lombard speech.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call