Abstract

Spatial filtering of speech based on neural networks has been widely studied. However, existing approaches focus on improving signal extraction or separation performance, and the question of how to define the signal in the direction-of-interest (DOI) for spatial filtering has not been investigated in detail. This study proposes a method to train neural networks for extracting directivity components of speech signals in the DOI. To this end, we formulate the problem by defining the DOI and its corresponding desired signal in a reverberant environment. Moreover, we present an on-the-fly training data generation procedure that supplies spatially diverse data to the networks during training. The proposed method was evaluated in terms of spatial speech extraction and localization performance. In particular, we confirmed that a network trained with the proposed method on simulated datasets also works on real recordings.
