Abstract
This study focuses on an unexplored aspect of the performance of algorithms for blind reverberation time (T) estimation: the effect that the phonetic content of a speech signal has on the estimate of T obtained from the reverberant version of that signal. To this end, the performance of three algorithms is assessed on a set of logatome recordings artificially reverberated with room impulse responses from four rooms, with T20 values in the interval [0.18, 0.55] s. Analyses of variance showed that the null hypotheses of equal means of estimation errors can be rejected at the 0.05 significance level for the interaction terms between the factors “vowel”, “consonant”, and “room”. Tukey’s multiple comparison procedure revealed both similarities and differences in the behaviour of the algorithms. The differences stem from details of the algorithms’ implementation, such as the number of frequency bands and whether T is estimated continuously or only on selected segments of the signal, the so-called speech decay segments.
Highlights
Reverberation time (T) is one of the most important objective measures indicating the severity of reverberation in an enclosure
This study focuses on an unexplored aspect of the performance of algorithms for blind reverberation time (T) estimation: the effect that the phonetic content of a speech signal has on the estimate of T obtained from the reverberant version of that signal
The results presented in this study demonstrate that, in addition to their well-explored sensitivity to background noise, the T estimates of state-of-the-art algorithms for blind reverberation time estimation are strongly influenced by the phonemes present in the speech material
Summary
Reverberation time (T) is one of the most important objective measures indicating the severity of reverberation in an enclosure. It is the time it takes for the steady-state sound energy level to decay by 60 dB after the sound source is abruptly switched off [1]. Reverberation degrades speech signals captured by microphones (e.g., those in personal electronic devices such as mobile phones, videoconference systems, and hearing aids) positioned in the far field of a sound source (e.g., a person speaking). For this reason, a number of algorithms that estimate reverberation time directly from the received reverberant speech signal, the so-called blind algorithms, have been developed in the last two decades. An estimate of reverberation time obtained with these algorithms can be used in speech dereverberation, for either speech enhancement or speech/speaker recognition purposes [4,5,6].
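To make the 60 dB definition concrete, the following is a minimal sketch of how T20 can be read off a known room impulse response using Schroeder backward integration. This is the conventional non-blind measurement, not one of the blind algorithms assessed in the study. The impulse response is synthetic (exponentially decaying noise), and the function names, the 16 kHz sampling rate, and the 0.4 s target value are all illustrative assumptions rather than details taken from the study.

```python
import numpy as np


def synthetic_rir(t60, fs=16000, duration=1.0, seed=0):
    """Hypothetical impulse response: white noise with an exponential envelope."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(duration * fs)) / fs
    # Amplitude falls by 60 dB (a factor of 10**-3) after t60 seconds.
    envelope = 10.0 ** (-3.0 * t / t60)
    return rng.standard_normal(t.size) * envelope


def schroeder_edc_db(h):
    """Energy decay curve in dB via backward integration of the squared response."""
    energy = np.cumsum(h[::-1] ** 2)[::-1]
    return 10.0 * np.log10(energy / energy[0])


def t20_from_rir(h, fs):
    """Fit the decay between -5 dB and -25 dB and extrapolate to 60 dB."""
    edc = schroeder_edc_db(h)
    t = np.arange(h.size) / fs
    mask = (edc <= -5.0) & (edc >= -25.0)
    slope, _ = np.polyfit(t[mask], edc[mask], 1)  # slope in dB per second
    return -60.0 / slope


if __name__ == "__main__":
    fs = 16000
    h = synthetic_rir(t60=0.4, fs=fs)  # assumed value inside the [0.18, 0.55] s range
    print(f"Estimated T20: {t20_from_rir(h, fs):.3f} s")
```

The estimate should land close to the 0.4 s used to build the synthetic response. A blind algorithm has no access to `h` and must infer the same decay rate from the reverberant speech itself, which is why the phonetic content of that speech can affect the result.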