Abstract

Acoustic source localization in the spherical harmonic domain under reverberation has not yet been investigated extensively. When joint direction-of-arrival (DOA) estimation is treated as a classification task, it requires a large number of classes and a large-scale dataset, which makes it time-consuming and costly. This paper therefore proposes a novel learning-based method that estimates azimuth and elevation separately, using two independent convolutional neural networks (CNNs) with residual blocks. First, a mapping matrix is adopted to transform the spherical harmonic function into the Kronecker product of two Vandermonde vectors. Based on this special structure, features that depend on azimuth and on elevation, respectively, are extracted as inputs to the two networks. Joint DOA estimation is thereby divided into two sub-tasks, each handled by one of the independent CNNs with residual blocks, and the two sub-tasks can be carried out in parallel. Furthermore, the model learned by CNNs with residual blocks performs better than the model learned by ordinary CNNs. Experiments are conducted on both simulated and real speech data to evaluate the performance of the proposed method. The results show that the method achieves higher accuracy. Moreover, the method saves time and effectively reduces computational complexity.
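
The separable structure mentioned in the abstract can be illustrated with a small numerical sketch. This is not the paper's mapping matrix or feature extraction. It only assumes that each Vandermonde vector has the form [e^{-jNx}, ..., 1, ..., e^{jNx}] for an angle x, and shows why a Kronecker product of two such vectors lets azimuth and elevation be read off independently. The helper name `vandermonde_vector`, the order N = 3, and the angle values are hypothetical choices made for illustration.

```python
import numpy as np

def vandermonde_vector(angle, order):
    """Assumed form: [e^{-j*order*angle}, ..., 1, ..., e^{j*order*angle}]."""
    m = np.arange(-order, order + 1)
    return np.exp(1j * m * angle)

order = 3                       # hypothetical harmonic order N
azimuth = np.deg2rad(40.0)      # illustrative source direction
elevation = np.deg2rad(65.0)

a_el = vandermonde_vector(elevation, order)
a_az = vandermonde_vector(azimuth, order)

# Kronecker product: k[i * (2N+1) + j] = a_el[i] * a_az[j]
k = np.kron(a_el, a_az)

# Reshaping exposes the rank-one (separable) structure K = a_el a_az^T.
K = k.reshape(2 * order + 1, 2 * order + 1)

# The centre entry of each vector is 1, so the centre row depends only
# on azimuth and the centre column depends only on elevation.
az_feature = K[order, :]
el_feature = K[:, order]

# Adjacent entries of a Vandermonde vector differ by e^{j*angle}.
az_est = np.angle(az_feature[1:] / az_feature[:-1]).mean()
el_est = np.angle(el_feature[1:] / el_feature[:-1]).mean()
```

Because the two slices carry no information about each other's angle, each one could in principle feed a separate estimator, which is the idea behind splitting joint DOA estimation into two sub-tasks.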
