Abstract

This study investigates the relationship between the intelligibility and quality of modified speech in noise and in quiet. Speech signals were processed by seven algorithms designed to increase speech intelligibility in noise without altering speech intensity. In three noise maskers, including both stationary and fluctuating noise, at two signal-to-noise ratios (SNRs), listeners identified keywords from unmodified or modified sentences. The intelligibility of each type of speech was measured as the listeners’ word recognition rate in each condition, while quality was rated as a mean opinion score. In quiet, only the perceptual quality of each type of speech was assessed. The results suggest that, when listening in noise, a modification's ability to improve intelligibility matters more than its potential negative impact on speech quality. However, when listening in quiet or at SNRs at which intelligibility is no longer an issue for listeners, the impact of modification on speech quality becomes a concern.

Highlights

  • During the last decade, a considerable number of speech modification algorithms have been proposed in order to boost speech intelligibility in adverse listening environments while maintaining a constant input-output speech intensity

  • While the majority of modification algorithms operate in the frequency domain, such as enhancing frequency components which are important to speech intelligibility in noise [6,7,8] and boosting certain spectral regions based on optimising objective intelligibility metrics [9,10,11,12], other algorithms make changes in the time domain, including introducing pauses into speech and speeding up or slowing down part of the speech to avoid a temporal clash between the speech and masker [10,13]

  • Listeners tended to rate the quality of the modified speech rather differently across modifications and, for Selboost, across the noise maskers in whose presence the modification was performed

Summary

Introduction

A considerable number of speech modification algorithms have been proposed in order to boost speech intelligibility in adverse listening environments while maintaining a constant input-output speech intensity. While the majority of modification algorithms operate in the frequency domain, such as enhancing frequency components which are important to speech intelligibility in noise [6,7,8] and boosting certain spectral regions based on optimising objective intelligibility metrics [9,10,11,12], other algorithms make changes in the time domain, including introducing pauses into speech and speeding up or slowing down part of the speech to avoid a temporal clash between the speech and masker [10,13]. Approaches combining both spectral and temporal modifications have achieved better performance than either approach alone [14,15,16].
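
The constant input-output intensity constraint mentioned above can be illustrated with a minimal sketch. It is not one of the algorithms studied here: the first-order pre-emphasis filter stands in for a generic spectral modification, and the function names, the filter coefficient, and the synthetic test signal are all assumptions made for illustration. The point is only that, after reshaping the spectrum, the output is rescaled so that its root-mean-square (RMS) level matches that of the input.

```python
import math


def rms(signal):
    """Root-mean-square level of a non-empty sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def pre_emphasis(signal, coeff=0.95):
    """First-order high-pass filter y[n] = x[n] - coeff * x[n-1].

    This is a crude spectral tilt that favours higher frequencies; it is a
    placeholder for a real modification algorithm (assumption).
    """
    out = [signal[0]]
    for prev, cur in zip(signal, signal[1:]):
        out.append(cur - coeff * prev)
    return out


def modify_with_constant_rms(signal, coeff=0.95):
    """Apply the tilt, then rescale so the output RMS equals the input RMS."""
    shaped = pre_emphasis(signal, coeff)
    shaped_rms = rms(shaped)
    if shaped_rms == 0.0:
        return shaped  # silent output: nothing to rescale
    gain = rms(signal) / shaped_rms
    return [gain * x for x in shaped]


# Hypothetical test signal: one low and one high frequency component, 1 s at 16 kHz.
fs = 16000
speech_like = [
    math.sin(2 * math.pi * 200 * n / fs) + 0.3 * math.sin(2 * math.pi * 3000 * n / fs)
    for n in range(fs)
]
modified = modify_with_constant_rms(speech_like)
```

With this renormalisation, any change in how the signal is perceived comes from how its energy is redistributed, not from an overall level increase.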
