Robust Sound Source Localization Using Convolutional Neural Network Based on Microphone Array

Xiaoyan Zhao, Lin Zhou, Yuxiao Qi, Ying Tong, Jingang Shi

doi:10.32604/iasc.2021.018823

Abstract

To improve the performance of microphone array-based sound source localization (SSL), this paper proposes a robust SSL algorithm using a convolutional neural network (CNN). The Gammatone sub-band steered response power-phase transform (SRP-PHAT) spatial spectrum is adopted as the localization cue because of the feature correlation between consecutive sub-bands. Because a CNN has the “weight sharing” property and is well suited to processing tensor data, it is used to extract spatial location information from the localization cues. The Gammatone sub-band SRP-PHAT spatial spectra are calculated from the microphone signals, which are decomposed in the frequency domain by a Gammatone filter bank. The proposed algorithm takes as CNN input a two-dimensional feature matrix assembled from the Gammatone sub-band SRP-PHAT spatial spectra within a frame. Taking advantage of the powerful modeling capability of the CNN, the two-dimensional feature matrices from diverse environments are used together to train a CNN model that captures the mapping between the feature matrix and the azimuth of the sound source. The azimuth of a testing signal is then predicted by the trained CNN model. Experimental results show the superiority of the proposed algorithm for the SSL problem: it achieves significantly improved localization performance, robustness, and generality in various acoustic environments.
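
The localization cue described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: rectangular FFT-bin bands stand in for the Gammatone filter bank, a far-field planar propagation model is assumed, and the function name, its arguments, and the band edges are all hypothetical.

```python
import numpy as np

def subband_srp_phat(frame, mic_xy, azimuths_deg, fs, band_edges_hz, c=343.0):
    """Return an (I, L) matrix: I sub-bands by L steering azimuths.

    frame         : (M, N) array, one row of N samples per microphone
    mic_xy        : (M, 2) microphone coordinates in metres
    band_edges_hz : I + 1 increasing band edges (assumption: rectangular
                    bands replace the Gammatone filters of the paper)
    """
    n_mics, n_samples = frame.shape
    spec = np.fft.rfft(frame, axis=1)                      # (M, F)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)         # (F,)

    # Far-field arrival time at each microphone for each steering azimuth.
    az = np.deg2rad(np.asarray(azimuths_deg, dtype=float))
    unit = np.stack([np.cos(az), np.sin(az)], axis=1)      # (L, 2)
    arrival = -(unit @ mic_xy.T) / c                       # (L, M)

    n_bands = len(band_edges_hz) - 1
    feature = np.zeros((n_bands, len(az)))
    for m in range(n_mics):
        for n in range(m + 1, n_mics):
            cross = spec[m] * np.conj(spec[n])
            cross = cross / (np.abs(cross) + 1e-12)        # PHAT weighting
            tau = arrival[:, m] - arrival[:, n]            # (L,)
            steer = np.exp(2j * np.pi * freqs[None, :] * tau[:, None])
            steered = np.real(cross[None, :] * steer)      # (L, F)
            for i in range(n_bands):
                mask = (freqs >= band_edges_hz[i]) & (freqs < band_edges_hz[i + 1])
                feature[i] += steered[:, mask].sum(axis=1)
    return feature
```

The result has one row per sub-band and one column per candidate azimuth, which is the kind of two-dimensional per-frame input the abstract describes. Summing over rows gives a broadband spatial spectrum whose peak indicates the steering azimuth that best explains the frame.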

Highlights

Microphone array-based sound source localization (SSL) aims to determine the location of a sound source by applying signal processing to the multichannel received signals.
P(k) is the feature matrix of the kth frame; P_i(r_l, k) is the ith Gammatone sub-band steered response power (SRP)-PHAT value at steering position r_l in the kth frame, calculated by Eq. (4); I is the number of Gammatone filter channels; and L is the number of steering positions.
The performance of the proposed algorithm is compared with two related algorithms, namely SRP-PHAT [11] and SSL based on a deep neural network (SSL-DNN) [27].
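
Written out from the definitions in the highlights above, the feature matrix for frame k would look roughly like this. The orientation (rows indexing sub-bands, columns indexing steering positions) is an assumption, since it is not stated in this summary.

```latex
\mathbf{P}(k) =
\begin{bmatrix}
P_1(r_1, k) & P_1(r_2, k) & \cdots & P_1(r_L, k) \\
P_2(r_1, k) & P_2(r_2, k) & \cdots & P_2(r_L, k) \\
\vdots      & \vdots      & \ddots & \vdots      \\
P_I(r_1, k) & P_I(r_2, k) & \cdots & P_I(r_L, k)
\end{bmatrix}
\in \mathbb{R}^{I \times L}
```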

Summary

Introduction

Microphone array-based sound source localization (SSL) aims to determine the location of a sound source by applying signal processing to the multichannel received signals. The second way of applying deep learning to the SSL task has been more widely studied, and these approaches involve a variety of input feature types, such as interaural level difference (ILD), interaural phase difference (IPD), cross-correlation function (CCF), and generalized cross correlation (GCC). The methods in [20,21] combined ILDs and the CCF as input features: an SSL algorithm fusing deep and convolutional neural networks is presented in [20], and a method based on a DNN and cluster analysis is presented in [21] to improve localization performance under mismatched head-related transfer function (HRTF) conditions. The approach in [25] took the cross correlations in different mel-scale frequency bands as input features and trained a CNN model to estimate a map of the sound source direction of arrival.
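
As an illustration of one of the input features listed above, the following is a minimal sketch of the standard GCC with PHAT weighting for a single microphone pair. It is not taken from any of the cited works; the function name and parameters are hypothetical.

```python
import numpy as np

def gcc_phat(x, y, fs, max_tau=None):
    """Estimate the delay of y relative to x, in seconds, with GCC-PHAT.

    Returns (tau, cc), where cc holds the correlation values for lags
    -max_shift .. +max_shift in samples.
    """
    n = len(x) + len(y)                     # zero-pad to limit circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)
    cross = cross / (np.abs(cross) + 1e-12) # PHAT: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)

    max_shift = n // 2
    if max_tau is not None:
        # Restrict the search to physically plausible lags (at least one sample).
        max_shift = min(max(int(fs * max_tau), 1), max_shift)

    # Reorder so that negative lags come first, then lag 0, then positive lags.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs, cc
```

A positive return value means the signal reaches the second microphone later than the first. In practice max_tau would typically be set to the microphone spacing divided by the speed of sound, so that only feasible delays are searched.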

System Overview

The Architecture of CNN

The Training of CNN

Simulation Setup

Evaluation in Trained Environments

Evaluation in Untrained Environments

Conclusion


Journal: Intelligent Automation & Soft Computing	Publication Date: Jan 1, 2021
Citations: 3	License type: cc-by
