Distributed Speech Enhancement in Wireless Acoustic Sensor Networks

Yuan Zeng

doi:10.4233/uuid:9b1c231f-a4bd-4f5e-b49a-88bd465faf82

https://doi.org/10.4233/uuid:9b1c231f-a4bd-4f5e-b49a-88bd465faf82

Publication Date: Jun 18, 2015

Abstract

In digital speech communication applications such as hands-free mobile telephony, hearing aids and human-to-computer communication systems, the recorded speech signals are typically corrupted by background noise. As a result, their quality and intelligibility can be severely degraded. Traditional noise reduction approaches process signals recorded by microphone arrays using centralized beamforming techniques. Recent advances in micro-electro-mechanical systems and wireless communications enable the development of wireless sensor networks (WSNs), where low-cost, low-power and multi-functional wireless sensing devices are connected via wireless links. Compared with conventional localized and regularly arranged microphone arrays, wireless sensor nodes can be placed randomly in the environment and thus cover a larger spatial field and yield more information on the observed signals. This thesis addresses several problems in multi-microphone speech enhancement for wireless acoustic sensor networks (WASNs), such as distributed noise reduction, clock synchronization and privacy preservation. First, we develop a distributed delay-and-sum beamformer (DDSB) for speech enhancement in WASNs. Because each wireless device has limited power, signal processing algorithms with low computational complexity and low communication cost are preferred in WASNs. Distributed signal processing lets each node communicate only with its neighboring nodes and perform local processing, so that the communication load and computational complexity are distributed over all nodes in the network. Without a central processor or any constraint on the network topology, the DDSB algorithm estimates the desired speech signal via local processing and local communication. The DDSB algorithm is based on an iterative scheme: in each iteration, pairs of neighboring nodes update their estimates according to the principle of the traditional delay-and-sum beamformer (DSB).
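The pairwise update described above can be illustrated with a minimal sketch of randomized pairwise averaging. This is not the thesis's DDSB implementation: it assumes each node already holds a time-aligned scalar sample of the desired signal, so that the centralized delay-and-sum output reduces to the plain average over nodes. The function name `pairwise_gossip`, the ring topology and the sample values are all hypothetical.

```python
import random


def pairwise_gossip(values, edges, iterations, seed=0):
    """Randomized pairwise averaging over a connected graph.

    values: one time-aligned sample per node (assumption: alignment is done).
    edges:  list of (i, j) pairs that are allowed to communicate.
    """
    rng = random.Random(seed)
    x = list(values)
    for _ in range(iterations):
        # Pick one pair of neighboring nodes at random.
        i, j = rng.choice(edges)
        # Both nodes replace their estimate by the pair average,
        # which keeps the network-wide sum unchanged.
        avg = 0.5 * (x[i] + x[j])
        x[i] = avg
        x[j] = avg
    return x


# Hypothetical 5-node ring with made-up aligned samples.
aligned = [1.0, 0.8, 1.3, 0.9, 1.1]
ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]

estimates = pairwise_gossip(aligned, ring, iterations=500)
centralized = sum(aligned) / len(aligned)  # average a fusion center would compute
max_error = max(abs(e - centralized) for e in estimates)
```

Each exchange involves only two neighbors, yet every node's estimate approaches the value a fusion center would compute. The number of exchanges needed is the communication cost.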
The DDSB estimate converges asymptotically to the optimal solution of the centralized beamformer. However, our experimental study indicates that the noise reduction performance of the DDSB comes at the expense of a higher communication cost, which can be a serious drawback in practical applications. Therefore, in the second part of this thesis, a clique-based distributed beamformer (CbDB) is proposed to reduce the communication cost of the original DDSB algorithm. In the CbDB, the nodes in two neighboring non-overlapping cliques update their estimates simultaneously in each iteration. Since each non-overlapping clique consists of multiple nodes, the CbDB allows more nodes to update their estimates per iteration and leads to lower communication costs than the original DDSB algorithm. Furthermore, theoretical and experimental studies show that the CbDB converges to the centralized beamformer and is more robust to sensor node failures in WASNs. In the third part of this thesis, we propose a privacy-preserving minimum variance distortionless response (MVDR) beamformer for speech enhancement in WASNs. Different wireless devices in a WASN generally belong to different users. We consider a scenario where a user joins the WASN and estimates his desired source via the WASN, but wants to keep his source of interest private. To introduce a distributed MVDR beamformer in such a scenario, a distributed approach is first proposed for recursive estimation of the inverse of the correlation matrix in randomly connected WASNs. This distributed approach is based on the fact that, using the Sherman-Morrison formula, estimation of the inverse of the correlation matrix can be seen as a consensus problem. By hiding the steering vector, the privacy-preserving MVDR beamformer can reach the same noise reduction performance as its centralized version. In the final part of this thesis, we investigate clock synchronization problems for multi-microphone speech enhancement in WASNs.
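Returning to the third part, the Sherman-Morrison recursion it mentions for the inverse correlation matrix can be sketched as follows. This is only the centralized rank-one identity, not the distributed consensus formulation developed in the thesis. It assumes real-valued snapshots and an un-normalized update R + y yᵀ, and the name `sherman_morrison_update` and the example data are hypothetical.

```python
def sherman_morrison_update(r_inv, y):
    """Return the inverse of (R + y y^T), given r_inv = R^{-1}.

    Assumes R is real and symmetric, so that y^T R^{-1} = (R^{-1} y)^T.
    """
    n = len(y)
    # g = R^{-1} y
    g = [sum(r_inv[i][k] * y[k] for k in range(n)) for i in range(n)]
    # Scalar 1 + y^T R^{-1} y
    denom = 1.0 + sum(y[i] * g[i] for i in range(n))
    # R^{-1} - (g g^T) / denom
    return [[r_inv[i][j] - g[i] * g[j] / denom for j in range(n)]
            for i in range(n)]


# Hypothetical two-microphone example, starting from R = I.
r = [[1.0, 0.0], [0.0, 1.0]]
r_inv = [[1.0, 0.0], [0.0, 1.0]]
snapshots = [[1.0, 2.0], [0.5, -1.0]]

for y in snapshots:
    r_inv = sherman_morrison_update(r_inv, y)
    # Track R itself only to check the result afterwards.
    r = [[r[i][j] + y[i] * y[j] for j in range(2)] for i in range(2)]

# R times its recursively updated inverse should be close to the identity.
product = [[sum(r[i][k] * r_inv[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
```

The update needs only matrix-vector products and one scalar division, so no explicit matrix inversion is performed at any step.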
Each wireless device in a WASN is equipped with an independent clock oscillator, and therefore clock differences are inevitable. However, clock differences between capturing devices cause signal drift and lead to severe performance degradation of multi-microphone noise reduction algorithms. We provide a theoretical analysis of the effect of clock synchronization problems on beamforming and evaluate three different clock synchronization algorithms in the context of multi-microphone noise reduction. Our experimental study shows that, in ideal scenarios, all three clock synchronization algorithms achieve sufficient accuracy for the MVDR beamformer. However, in practical scenarios with measurement uncertainty or noise, the output of the MVDR beamformer degrades with time-stamp-based clock synchronization algorithms, while the accuracy of signal-based clock synchronization algorithms remains sufficient for the MVDR beamformer, albeit at a much higher communication cost.
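To give a sense of scale for the signal drift mentioned above, the following back-of-the-envelope sketch converts a relative clock skew into accumulated drift and the resulting phase error. The skew of 50 ppm, the 16 kHz sampling rate, the 10 s duration and the 1 kHz test frequency are illustrative assumptions, not values taken from the thesis.

```python
import math


def drift_in_samples(skew_ppm, sample_rate_hz, elapsed_s):
    """Accumulated offset, in samples, between two devices whose
    sampling clocks differ by skew_ppm parts per million."""
    return skew_ppm * 1e-6 * sample_rate_hz * elapsed_s


def phase_error_rad(drift_samples, sample_rate_hz, freq_hz):
    """Phase error that a given drift causes at one signal frequency."""
    delay_s = drift_samples / sample_rate_hz
    return 2.0 * math.pi * freq_hz * delay_s


# Illustrative values only (assumptions, see the text above).
drift = drift_in_samples(skew_ppm=50.0, sample_rate_hz=16000.0, elapsed_s=10.0)
phase = phase_error_rad(drift, sample_rate_hz=16000.0, freq_hz=1000.0)
# drift is about 8 samples, i.e. 0.5 ms, which is half a period at 1 kHz.
```

Under these assumptions the two recordings are half a period apart at 1 kHz after ten seconds, so summing them would attenuate that component instead of reinforcing it.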
