Abstract

In this research, a novel sound source localization model is introduced that integrates a convolutional neural network with a regression model (CNN-R) to estimate the sound source angle and distance based on the interaural phase difference (IPD). The IPD features of the sound signal are first extracted in the time-frequency domain using the short-time Fourier transform (STFT). The IPD feature map is then fed to the CNN-R model as an image for sound source localization. The Pyroomacoustics platform and the multichannel impulse response database (MIRD) are used to generate simulated and real room impulse response (RIR) datasets, respectively. In the simulation scenario at SNR = 30 dB and RT60 = 0.16 s, the proposed CNN-R achieves average accuracies of 98.96% and 98.31% for angle and distance estimation, respectively. In the real environment at SNR = 30 dB and RT60 = 0.16 s, the average accuracies for angle and distance estimation are 99.85% and 99.38%, respectively. The performance obtained in both scenarios is superior to that of existing models, indicating the potential of the proposed CNN-R model for real-life applications.
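The IPD extraction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `stft_frames` and `ipd_map`, the Hann window, and the values of `n_fft` and `hop` are all assumptions chosen for illustration, and a two-microphone signal is assumed.

```python
import numpy as np

def stft_frames(x, n_fft=512, hop=256):
    """Minimal Hann-windowed STFT (hypothetical helper).

    Returns a complex array of shape (n_fft // 2 + 1, n_frames).
    """
    x = np.asarray(x, dtype=float)
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1).T

def ipd_map(left, right, n_fft=512, hop=256):
    """Phase difference between two channels for every time-frequency bin.

    The angle of the cross-spectrum gives the IPD wrapped to [-pi, pi].
    """
    spec_left = stft_frames(left, n_fft, hop)
    spec_right = stft_frames(right, n_fft, hop)
    return np.angle(spec_left * np.conj(spec_right))
```

The resulting two-dimensional array (frequency bins by time frames) is the kind of image-like feature map that a CNN could take as input. Taking the angle of the cross-spectrum, rather than subtracting the two phase spectra directly, keeps the result wrapped without an extra step.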

Highlights

  • An original sound source localization model was developed by combining a convolutional neural network and a regression model (CNN-R)

  • The sound signals were transformed into time-frequency signals through short-time Fourier transform (STFT), and interaural phase difference (IPD) feature maps were calculated from the time-frequency signals

  • Accuracy (Acc.), mean absolute error (MAE), and root mean square error (RMSE) were used to evaluate the performance of the proposed model
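As a rough illustration of two of the metrics named in the highlights, MAE and RMSE can be computed with their standard definitions as sketched below. The function names and the sample values are hypothetical. The accuracy metric is left out because its exact definition for the regression outputs is not given in this summary.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error between true and estimated values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true)))

def rmse(y_true, y_pred):
    """Root mean square error between true and estimated values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

# Illustrative angles in degrees (made-up values, not results from the paper)
true_angles = [0, 30, 60, 90]
estimated_angles = [2, 30, 58, 94]
print(mae(true_angles, estimated_angles), rmse(true_angles, estimated_angles))
```

RMSE penalizes large errors more heavily than MAE, so reporting both shows whether the error is spread evenly or dominated by a few poor estimates. If angles spanned the full circle, the difference would need to be wrapped before applying either metric; that is not handled in this sketch.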

Summary

Introduction

GPS accuracy degrades in indoor environments because obstacles block signal propagation [5,6]. A number of technologies, such as infrared (IR), Bluetooth, and Wi-Fi, have been developed to address the challenge of indoor positioning and have become widely used for indoor localization in recent years [7]. The signals of indoor positioning technologies must propagate under line-of-sight (LOS) conditions to produce accurate location estimates [9]. Bluetooth and Wi-Fi have the advantage of strong penetrating power and can pass through indoor obstacles [11,12]. Sound source localization (SSL) has attracted much attention in recent years [16,17,18].

