Abstract
The estimation of the late reverberant spectral variance (LRSV) is of paramount importance in most reverberation suppression algorithms. This letter proposes an improved single-channel LRSV estimator that builds on Habets' LRSV estimator by adding an adaptive parameter estimator. Instead of estimating the direct-to-reverberation ratio (DRR), the proposed LRSV estimator directly estimates the parameter κ of a generalized statistical model, because experimental results show that even a κ calculated from the measured ground-truth DRR may not be the optimal parameter for the LRSV estimator. Experimental results using synthetic reverberant signals demonstrate the superiority of the proposed estimator over conventional approaches.
Highlights
Speech signals received within a room usually contain reverberation, which impairs the intelligibility of speech in communication scenarios such as mobile phones and hearing aids
Where $\alpha_\kappa(l) = p(l) + \alpha_\kappa \, (1 - p(l))$ is a time-varying smoothing parameter which is adjusted by the frame-conditional direct sound presence probability $p(l)$
We differentiate between the direct sound presence and absence hypotheses, and derive the frame-conditional direct sound presence probability p(l) using Bayes' rule
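The last two highlights can be illustrated with a minimal Python sketch. It is not the authors' implementation. The smoothing-parameter formula is taken from the text, with ακ (no frame index) treated as a fixed constant. Everything else is an assumption made for illustration: the function names, the likelihood inputs, the prior, the default value 0.9, and the first-order recursive update of κ.

```python
def presence_probability(lik_present, lik_absent, prior_present=0.5):
    """Posterior probability of direct sound presence via Bayes' rule.

    lik_present / lik_absent are the likelihoods of the current frame's
    observation under the presence / absence hypotheses. How they are
    computed is not specified here (hypothetical inputs).
    """
    num = prior_present * lik_present
    den = num + (1.0 - prior_present) * lik_absent
    return num / den


def smoothing_parameter(p, alpha_kappa):
    """Time-varying smoothing parameter: p(l) + alpha_kappa * (1 - p(l))."""
    return p + alpha_kappa * (1.0 - p)


def update_kappa(kappa_prev, kappa_inst, p, alpha_kappa=0.9):
    """Assumed first-order recursive update of the kappa estimate.

    With p = 1 (direct sound surely present) the smoothing parameter
    is 1 and the previous estimate is kept unchanged; with p = 0 the
    update reduces to ordinary smoothing with constant alpha_kappa.
    """
    a = smoothing_parameter(p, alpha_kappa)
    return a * kappa_prev + (1.0 - a) * kappa_inst


if __name__ == "__main__":
    # Hypothetical likelihood values for a single frame.
    p = presence_probability(lik_present=2.0, lik_absent=1.0)
    print("p(l) =", p)
    print("smoothing parameter =", smoothing_parameter(p, 0.9))
    print("updated kappa =", update_kappa(0.5, 0.2, p))
```

Under this assumed update rule, frames dominated by direct sound leave the κ estimate almost unchanged, while frames without direct sound adapt it at the rate set by the constant ακ.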
Summary
Speech signals received within a room usually contain reverberation, which impairs the intelligibility of speech in communication scenarios such as mobile phones and hearing aids. A central component of most reverberation suppression methods is the estimation of the late reverberant spectral variance (LRSV), which remains a challenging task because of its high time variability [5]. Habets proposed a single-channel LRSV estimator [3] based on a generalized statistical model [6] to suppress late reverberation, and it remains among the best-performing estimators [5]. Unlike traditional methods, which calculate κ from an estimated DRR, the present work proposes a blind adaptive κ estimator that improves the performance of Habets' LRSV estimator and makes it more practical. The proposed κ estimator is evaluated against an existing κ estimator [9] and against κ calculated from the measured ground-truth DRR. The quality of the dereverberated speech is evaluated and compared with that of a method using the recursive maximum-sparseness-power-prediction model (MSPP) [10].