Abstract

In this paper, we propose a directional signal extraction network (DSENet). DSENet is a low-latency, real-time neural network that, given a reverberant mixture of signals captured by a microphone array, aims to extract the reverberant signal whose source is located within a directional region of interest. If multiple sources lie within the directional region of interest, DSENet aims to extract a combination of their reverberant signals. As such, the formulation of DSENet circumvents the well-known crosstalk problem in beamforming while providing an alternative to, and perhaps a more practical approach than, other spatially constrained signal extraction methods proposed in the literature. DSENet is based on a computationally efficient, low-distortion linear model formulated in the time domain. As a result, an important application of our work is hearing improvement on edge devices. Simulation results show that DSENet outperforms oracle beamformers, as well as the state of the art in low-latency causal speech separation, while incurring a system latency of only 4 ms. Additionally, DSENet has been successfully deployed as a real-time application on a smartphone.
