The analysis of conversational signal-to-noise ratios (SNRs) measured in real-world scenarios can provide vital insight into people's communicative strategies and difficulties and guide the development of hearing devices. However, measuring SNRs accurately and realistically is challenging in typical recording conditions, where only a mixture of sound sources is captured. This study introduces a novel method for realistic in situ SNR estimation, where the speech signal of a person in natural conversation is captured by a cheek-mounted microphone, adjusted for free-field conditions, and convolved with a measured impulse response to estimate the clean speech component at the receiver. The noise-only component is estimated from the signal of a microphone near the receiver by applying a voice activity detector. The resulting SNR values are analyzed using in situ recordings of a real-world workspace meeting. Compared to a typical spectral-subtraction-derived method, the proposed method increases temporal resolution and tracks fluctuations in the speech level more accurately. The proposed SNR estimation method may be valuable for compensation procedures in hearing instruments that take conversational dynamics into account.
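The processing chain described above can be illustrated with a minimal sketch. This is not the study's implementation: the function names, the non-overlapping framing, the single long-term noise estimate, and the per-frame boolean `talker_active` input (standing in for the output of a voice activity detector) are all assumptions made for illustration. The free-field adjustment of the cheek-microphone signal is assumed to have been applied beforehand.

```python
import numpy as np


def frame_power(x, frame_len):
    """Mean power per non-overlapping frame (a trailing partial frame is dropped)."""
    x = np.asarray(x, dtype=float)
    n_frames = len(x) // frame_len
    frames = x[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.mean(frames ** 2, axis=1)


def estimate_snr_db(close_mic, receiver_mic, impulse_response, talker_active, frame_len):
    """Hypothetical frame-wise SNR estimate at the receiver, in dB.

    close_mic        : talker's close-microphone speech (assumed free-field adjusted)
    receiver_mic     : mixture recorded near the receiver
    impulse_response : measured talker-to-receiver impulse response
    talker_active    : per-frame booleans, assumed to come from a voice activity detector
    """
    # Clean speech at the receiver: close-mic speech convolved with the impulse response.
    speech_at_receiver = np.convolve(close_mic, impulse_response)[: len(close_mic)]

    speech_pow = frame_power(speech_at_receiver, frame_len)
    mix_pow = frame_power(receiver_mic, frame_len)

    # Align the three per-frame sequences to a common length.
    n = min(len(speech_pow), len(mix_pow), len(talker_active))
    speech_pow, mix_pow = speech_pow[:n], mix_pow[:n]
    active = np.asarray(talker_active[:n], dtype=bool)

    if active.all():
        raise ValueError("no noise-only frames available")

    # Noise power: average receiver-mic power over frames without talker speech.
    noise_pow = np.mean(mix_pow[~active])

    eps = 1e-12  # guards against log of zero
    snr_db = 10.0 * np.log10((speech_pow + eps) / (noise_pow + eps))

    # The SNR is only meaningful while the talker is speaking.
    return np.where(active, snr_db, np.nan)
```

Because the speech power comes from the convolved close-microphone signal rather than from the mixture, each active frame receives its own SNR value. A time-varying noise estimate (for example, interpolating between neighbouring noise-only frames) would be a natural refinement of the single average used here.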