Abstract

This paper presents a system for joint dereverberation and noise reduction that combines a beamformer with a single-channel spectral enhancement scheme. First, a minimum variance distortionless response (MVDR) beamformer with an online estimated noise coherence matrix is used to suppress noise and reverberation. The output of this beamformer is then processed by a single-channel spectral enhancement scheme, based on statistical room acoustics, minimum statistics, and temporal cepstrum smoothing, to suppress residual noise and reverberation. The evaluation is conducted using the REVERB challenge corpus, which is designed to evaluate speech enhancement algorithms in the presence of both reverberation and noise. The proposed system is evaluated using instrumental speech quality measures, the performance of an automatic speech recognition system, and a subjective evaluation of the speech quality based on a MUSHRA test. The performance achieved by beamforming, by single-channel spectral enhancement, and by their combination is compared. Experimental results show that the proposed system is effective in suppressing both reverberation and noise while improving the speech quality. The improvements are particularly significant in conditions with high reverberation times.
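
The beamforming stage named above can be illustrated with the textbook MVDR weight formula for a single frequency bin, w = Γ⁻¹d / (dᴴΓ⁻¹d), where Γ is the noise coherence matrix and d the steering vector. The following is a minimal sketch and not the authors' implementation: the abstract estimates Γ online, whereas this sketch substitutes a fixed spherically isotropic (diffuse) model as a stand-in. The function names, the diagonal loading value, the microphone geometry, and the delays are all assumptions made for illustration.

```python
import numpy as np


def diffuse_coherence(freq_hz, mic_positions, c=343.0):
    """Coherence matrix of a spherically isotropic noise field (assumed model).

    The source estimates the noise coherence online; this fixed model is
    only a placeholder so that the sketch is self-contained.
    """
    diff = mic_positions[:, None, :] - mic_positions[None, :, :]
    dists = np.linalg.norm(diff, axis=-1)
    # np.sinc(x) = sin(pi x) / (pi x), so this equals sin(2 pi f d / c) / (2 pi f d / c).
    return np.sinc(2.0 * freq_hz * dists / c)


def mvdr_weights(noise_coherence, steering, diag_load=1e-3):
    """MVDR weights for one frequency bin: w = G^-1 d / (d^H G^-1 d)."""
    n_mics = noise_coherence.shape[0]
    # Diagonal loading (hypothetical value) keeps the linear solve well conditioned.
    gamma = noise_coherence + diag_load * np.eye(n_mics)
    gamma_inv_d = np.linalg.solve(gamma, steering)
    return gamma_inv_d / (steering.conj() @ gamma_inv_d)


# Hypothetical setup: 4-microphone linear array with 5 cm spacing, 1 kHz bin.
freq = 1000.0
positions = np.array([[0.00, 0.0, 0.0],
                      [0.05, 0.0, 0.0],
                      [0.10, 0.0, 0.0],
                      [0.15, 0.0, 0.0]])
delays = np.array([0.0, 1e-4, 2e-4, 3e-4])  # assumed arrival delays in seconds

gamma = diffuse_coherence(freq, positions)
d = np.exp(-2j * np.pi * freq * delays)
w = mvdr_weights(gamma, d)

# Beamformer output for one vector of microphone STFT coefficients y: w^H y.
y = d * (0.7 + 0.2j)          # toy input: a scaled copy of the target signal
output = np.vdot(w, y)        # np.vdot conjugates its first argument
```

Because the weights satisfy wᴴd = 1, a signal arriving along the assumed steering vector passes through undistorted, while the noise power at the output is minimized for the given coherence model. The residual noise and reverberation are what the single-channel spectral enhancement stage would then address.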

Highlights

  • In many speech communication applications, such as voice-controlled systems or hearing aids, distant microphones are used to record a target speaker

  • The performance of the proposed system for each condition is evaluated in terms of instrumental speech quality measures

  • The performance of the combined scheme is compared to the performance when applying only the single-channel spectral enhancement scheme to the first microphone signal and when applying only the minimum variance distortionless response (MVDR) beamformer to the multichannel input

Summary

Introduction

In many speech communication applications, such as voice-controlled systems or hearing aids, distant microphones are used to record a target speaker. The microphone signals are often corrupted by both reverberation and noise, which degrades speech quality and speech intelligibility and reduces the performance of automatic speech recognition (ASR) systems. Several algorithms have been proposed in the literature to deal with these issues (cf. [1,2,3] and the references therein). This paper extends the description and evaluation of the system proposed by the authors in [4], which consists of a commonly used combination of a minimum variance distortionless response (MVDR) beamformer with a single-channel spectral enhancement
