Abstract

This paper addresses speech enhancement in dual-microphone smartphones using beamforming along with postfiltering techniques. The performance of these algorithms relies on a good estimation of the acoustic channel and of the speech and noise statistics. In this work we present a speech enhancement system that combines the estimation of the relative transfer function (RTF) between microphones using an extended Kalman filter framework with a novel speech presence probability estimator intended to track the variability of the noise statistics. The available dual-channel information is exploited to obtain more reliable estimates of the clean speech statistics. Noise reduction is further improved by means of postfiltering techniques that take advantage of the speech presence estimation. Our proposal is evaluated in different reverberant and noisy environments with the smartphone used in both close-talk and far-talk positions. The experimental results show that our system improves noise reduction while keeping speech distortion low, and yields better speech intelligibility than other state-of-the-art approaches.

Highlights

  • Speech-related services are pervasively available on mobile devices such as smartphones or tablets. Reverberant and noisy environments, where these devices are frequently used, often degrade speech signal quality and/or intelligibility [1].

  • Given the noise statistics and the noisy observations, we proposed in [15] tracking the relative transfer function (RTF) using an extended Kalman filter, which showed better estimation performance than other state-of-the-art approaches.

  • This oracle RTF was obtained from the clean speech signals using Equation (3) for time-frequency bins where speech presence was detected, while the RTF of the previous frame was reused for the remaining bins.


Summary

Introduction

Speech-related services are pervasively available on mobile devices such as smartphones or tablets. Reverberant and noisy environments, where these devices are frequently used, often degrade speech signal quality and/or intelligibility [1]. Many current devices include several microphones, so multi-channel speech processing techniques can be applied to reduce these distortions, improving noise reduction performance over single-channel approaches. The most common multi-channel speech processing technique is beamforming [2], which applies spatial filtering to the noisy speech signals captured by several microphones. One of these beamformers is the well-known Minimum Variance Distortionless Response (MVDR) beamformer [3], which has the advantage of being able to reduce the noise power without introducing speech distortion.
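
As a rough illustration of the distortionless property mentioned above, the following minimal sketch computes textbook MVDR weights for a single frequency bin of a two-microphone device. It assumes the noise spatial covariance matrix and the RTF are already known, and that the steering vector is [1, RTF] with the first microphone as reference. The function name, variable names, and toy values are hypothetical and chosen for illustration only; this is not the system evaluated in the paper, which additionally tracks the RTF with an extended Kalman filter and applies postfiltering.

```python
import numpy as np


def mvdr_weights(noise_cov, rtf):
    """Textbook MVDR weights for one frequency bin of a two-microphone array.

    noise_cov: (2, 2) Hermitian noise spatial covariance matrix (assumed known).
    rtf: complex relative transfer function of the secondary microphone
         with respect to the reference microphone (assumed known).
    """
    # Steering vector relative to the reference microphone (unit gain there).
    steering = np.array([1.0, rtf], dtype=complex)
    # Solve noise_cov @ x = steering rather than inverting the matrix.
    x = np.linalg.solve(noise_cov, steering)
    # Normalise so that w^H @ steering == 1 (distortionless constraint).
    return x / (steering.conj() @ x)


# Toy values, for illustration only.
rtf = 0.5 * np.exp(1j * 0.3)
noise_cov = np.array([[1.0, 0.2 + 0.1j],
                      [0.2 - 0.1j, 1.5]])
steering = np.array([1.0, rtf], dtype=complex)

w = mvdr_weights(noise_cov, rtf)

# Gain applied to the speech component at the reference microphone.
distortionless = w.conj() @ steering
# Residual noise power at the beamformer output.
noise_power_out = np.real(w.conj() @ noise_cov @ w)
```

In this sketch the speech component at the reference microphone passes with unit gain, while the output noise power cannot exceed the noise power at the reference microphone, because simply selecting the reference channel also satisfies the distortionless constraint and MVDR minimises the noise power over all such filters.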

