Abstract

Room acoustic parameters that characterize acoustic environments can help to improve signal enhancement algorithms such as for dereverberation, or automatic speech recognition by adapting models to the current parameter set. The reverberation time (RT) and the early-to-late reverberation ratio (ELR) are two key parameters. In this paper, we propose a blind ROom Parameter Estimator (ROPE) based on an artificial neural network that learns the mapping to discrete ranges of the RT and the ELR from single-microphone speech signals. Auditory-inspired acoustic features are used as neural network input, which are generated by a temporal modulation filter bank applied to the speech time-frequency representation. ROPE performance is analyzed in various reverberant environments in both clean and noisy conditions for both fullband and subband RT and ELR estimations. The importance of specific temporal modulation frequencies is analyzed by evaluating the contribution of individual filters to the ROPE performance. Experimental results show that ROPE is robust against different variations caused by room impulse responses (measured versus simulated), mismatched noise levels, and speech variability reflected through different corpora. Compared to state-of-the-art algorithms that were tested in the acoustic characterisation of environments (ACE) challenge, the ROPE model is the only one that is among the best for all individual tasks (RT and ELR estimation from fullband and subband signals). Improved fullband estimations are even obtained by ROPE when integrating speech-related frequency subbands. Furthermore, the model requires the least computational resources with a real time factor that is at least two times faster than competing algorithms. Results are achieved with an average observation window of 3 s, which is important for real-time applications.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.