Abstract
This paper proposes a new and effective speech enhancement algorithm based on Gaussian Mixture Modelling (GMM) and the Minimum Mean Square Error (MMSE) criterion, in which no assumption is made about the nature or stationarity of the noise. Neither Voice Activity Detection (VAD) nor any other auxiliary means is used to estimate the input Signal to Noise Ratio (SNR). The mean vectors of the mixture models of spectral magnitudes, derived from models of the power spectra of speech and of different noise sources, are used to form a set of over-determined systems of equations, one per candidate noise source, whose solutions lead to the MMSE estimates of the speech and additive noise spectral magnitudes. The corresponding power spectra are then used for noise suppression by Wiener filtering applied to overlapping frames. The input SNR is estimated, and the type of noise involved is determined, as by-products of the method. Results are compared with codebook-constrained methods, which have shown very good results but suffer from long processing times. It is shown that, at the cost of slightly lower improvements in SNR and PESQ score, the new algorithm reduces the computation time to one fifth, which makes it suitable for practical applications.
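The per-frame procedure summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual equations, so the following rests on assumptions: the observed magnitude spectrum is modelled as a linear combination of the speech and noise mixture mean vectors, each over-determined system is solved by ordinary least squares, and the candidate noise source with the smallest residual is selected. The function name `enhance_frame`, its arguments, and the clipping of negative magnitudes are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def enhance_frame(y_mag, speech_means, noise_means_by_source):
    """Illustrative sketch only (assumed linear model, not the paper's exact method).

    y_mag: (K,) observed noisy magnitude spectrum of one frame.
    speech_means: (K, Ms) matrix whose columns are speech mixture mean vectors.
    noise_means_by_source: dict mapping a noise name to a (K, Mn) matrix of
        noise mixture mean vectors; K must exceed Ms + Mn (over-determined).
    Returns (enhanced magnitude, selected noise name, estimated SNR in dB).
    """
    ms = speech_means.shape[1]
    best = None
    for name, noise_means in noise_means_by_source.items():
        # One over-determined system per candidate noise source.
        A = np.hstack([speech_means, noise_means])
        w, *_ = np.linalg.lstsq(A, y_mag, rcond=None)
        err = float(np.sum((A @ w - y_mag) ** 2))
        if best is None or err < best[0]:
            # Assumption: negative magnitudes are clipped to zero.
            s_mag = np.clip(speech_means @ w[:ms], 0.0, None)
            n_mag = np.clip(noise_means @ w[ms:], 0.0, None)
            best = (err, name, s_mag, n_mag)

    _, name, s_mag, n_mag = best
    ps, pn = s_mag ** 2, n_mag ** 2           # estimated power spectra
    gain = ps / (ps + pn + 1e-12)             # Wiener gain per frequency bin
    snr_db = 10.0 * np.log10((ps.sum() + 1e-12) / (pn.sum() + 1e-12))
    return gain * y_mag, name, snr_db
```

In this sketch the selected noise name and the SNR estimate fall out of the same least-squares solve that produces the Wiener gain, which mirrors the abstract's description of them as by-products. A full system would apply the function to overlapping windowed frames and resynthesize the signal by overlap-add using the noisy phase.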