Abstract
Frequency-dependent and location-dependent components of the speech signal can be decoupled by processing the signal in the spherical harmonic domain. In this paper, a sparsity-based method for joint source localization and separation using online dictionary learning is proposed. Conventional sparsity-based methods utilize an overcomplete dictionary to find a sparse linear combination of dictionary atoms. The online dictionary learning approach discussed herein addresses the joint localization and separation problem by learning the dictionary atoms through stochastic approximation. The location-dependent terms present in the dictionary atoms at various frequencies are then clustered to obtain a robust estimate of the number of sources and their locations. Using these estimates, the sources are separated from the mixture. Experiments on speech source localization and separation are conducted at various signal-to-noise ratios (SNRs). Performance measures such as root mean square error (RMSE), log spectral distance, and perceptual mean opinion scores indicate reasonable improvement over conventional methods for speech source separation.
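As an illustration of dictionary learning by stochastic approximation, the following is a minimal sketch and not the implementation used in this paper. It assumes real-valued data (spherical harmonic domain signals are complex in general), uses ISTA for the sparse coding step, and follows the widely used sufficient-statistics form of online dictionary learning. The function names, the regularization weight `lam`, and the synthetic data are all hypothetical choices made for the example.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the l1 norm.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code(D, x, lam, n_iter=100):
    # ISTA for min_a 0.5 * ||x - D a||^2 + lam * ||a||_1 (assumed sparse coder).
    L = np.linalg.norm(D, 2) ** 2 + 1e-12  # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        a = soft_threshold(a - D.T @ (D @ a - x) / L, lam / L)
    return a

def online_dictionary_learning(X, n_atoms, lam=0.1, seed=0):
    # X holds one observation per column; samples are visited one at a time.
    rng = np.random.default_rng(seed)
    dim, n_samples = X.shape
    D = rng.standard_normal((dim, n_atoms))
    D /= np.linalg.norm(D, axis=0)          # unit-norm initial atoms
    A = np.zeros((n_atoms, n_atoms))        # running sum of a a^T
    B = np.zeros((dim, n_atoms))            # running sum of x a^T
    for t in rng.permutation(n_samples):
        x = X[:, t]
        a = sparse_code(D, x, lam)
        A += np.outer(a, a)
        B += np.outer(x, a)
        # Block coordinate descent over atoms using the accumulated statistics.
        for j in range(n_atoms):
            if A[j, j] < 1e-10:
                continue                    # atom not used so far; leave it unchanged
            u = D[:, j] + (B[:, j] - D @ A[:, j]) / A[j, j]
            D[:, j] = u / max(np.linalg.norm(u), 1.0)  # project onto the unit ball
    return D

# Synthetic example: each observation mixes two atoms of a random dictionary.
rng = np.random.default_rng(1)
D_true = rng.standard_normal((8, 12))
D_true /= np.linalg.norm(D_true, axis=0)
codes = np.zeros((12, 200))
for t in range(200):
    idx = rng.choice(12, size=2, replace=False)
    codes[idx, t] = rng.standard_normal(2)
X = D_true @ codes
D_learned = online_dictionary_learning(X, n_atoms=12, lam=0.05)
```

In the proposed method, the learned atoms would additionally carry location-dependent terms at each frequency, which are then clustered; that step is not shown in this sketch.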