Abstract
Many distributed speech enhancement algorithms for wireless acoustic sensor networks (WASNs) require multi-speaker voice activity detection (VAD) to estimate the speech and noise covariances. We propose a robust, sparsity-constrained, non-negative energy source separation algorithm for multi-speaker VAD in centralized and distributed network configurations. The sparsity of the speech sources is exploited via a non-negative energy unmixing algorithm that incorporates an ℓ1-penalized singular value decomposition to extract features for the VAD task. Detection then reduces to finding the non-zero elements of the separated energies. Robustness is achieved by integrating a tν M-estimator of the covariance matrix into the multi-source separation. The distributed method requires neither a fusion center nor prior knowledge of the node positions, the microphone array orientations, or the number of observed sources. The proposed VAD is evaluated in a practical distributed speech enhancement scenario in a WASN and significantly improves the node-specific signal estimation compared to an existing approach.
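The final detection step can be illustrated with a minimal sketch. It is not the proposed algorithm: it assumes the per-source energies have already been separated, and it applies only the scalar soft-thresholding operator commonly associated with an ℓ1 penalty before flagging the non-zero entries. The function names, the data layout and the threshold value are hypothetical.

```python
def soft_threshold(x, lam):
    """Soft-thresholding (proximal operator of the l1 penalty): shrink x toward zero by lam."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def detect_activity(energies, lam):
    """Flag a source as active in a frame when its thresholded energy is non-zero.

    energies[s][t] is the separated energy of source s in frame t
    (hypothetical layout, assumed to be given by an earlier separation stage).
    """
    return [[soft_threshold(e, lam) != 0.0 for e in row] for row in energies]


# Toy energies for two sources over four frames (made-up illustrative values).
energies = [
    [0.02, 0.90, 1.10, 0.05],
    [0.70, 0.03, 0.01, 0.80],
]
flags = detect_activity(energies, lam=0.1)
```

In this toy case the small residual energies are set exactly to zero by the threshold, so the activity pattern of each source is read directly from which entries remain non-zero.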