Abstract
Instrumental quality prediction of speech processed by enhancement algorithms has become crucial with the proliferation of far-field speech applications. Although several instrumental measures have been proposed and standardized, their performance across a wide range of acoustic conditions and enhancement algorithms is still unknown. This paper aims to fill this gap. Specifically, the performance of eleven instrumental measures is compared: four are non-intrusive, i.e., they do not require a clean reference signal, and seven are intrusive. Simulated and recorded speech under four different acoustic conditions, involving varying levels of reverberation and noise, is explored, as is speech processed by three single- and multi-channel enhancement algorithms. Experimental results show that a recently developed non-intrusive measure called SRMR_norm outperforms all other considered measures in terms of overall quality prediction. The well-known PESQ measure, in turn, was shown to better predict the perceived amount of reverberation, followed by SRMR_norm. These results are promising, as the latter measure does not require access to a clean reference signal and thus has the potential to be used for real-time optimization of enhancement algorithms.