Abstract

Drone-embedded sound source localization (SSL) is a promising tool for search and rescue scenarios in which poor lighting or occlusions limit visual sensing. The problem is complicated, however, by severe drone ego-noise, which can result in negative signal-to-noise ratios in the recorded microphone signals. In this paper, we present our work on drone-embedded SSL using recordings from an 8-channel cube-shaped microphone array embedded in an unmanned aerial vehicle (UAV). As a baseline, we use angular spectrum-based time difference of arrival (TDOA) estimation methods such as generalized cross-correlation with phase transform (GCC-PHAT) and minimum variance distortionless response (MVDR), which are state-of-the-art techniques for SSL. Although we improve the baseline by reducing ego-noise with a speed correlated harmonics cancellation (SCHC) technique, our main focus is on deep learning techniques for this challenging problem. We propose an end-to-end deep learning model for SSL, called DOANet. DOANet is based on a one-dimensional dilated convolutional neural network that computes the azimuth and elevation angles of the target sound source from the raw audio signal. Its advantage is that it requires neither hand-crafted audio features nor ego-noise reduction for direction of arrival (DOA) estimation. We evaluate the SSL performance of the proposed and baseline methods and find that DOANet shows promising results compared with the angular spectrum methods, both with and without SCHC. To compare the methods, we also adopt a well-known measure, the area under the curve (AUC) of cumulative histogram plots of angular deviations, as a performance indicator; to our knowledge, it has not previously been used as a performance indicator for this kind of problem.
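To make the AUC performance indicator concrete, the following is a minimal Python sketch of one way such a score could be computed. It is an illustration under stated assumptions, not the evaluation code used in this work: the function names, the use of the great-circle angle between estimated and true directions as the "angular deviation", the 1-degree threshold step, and the normalization by 180 degrees are all assumptions introduced here.

```python
import math

def angular_deviation_deg(az_est, el_est, az_true, el_true):
    """Great-circle angle in degrees between two directions.

    Assumption: the deviation is measured on the unit sphere; the paper
    may instead score azimuth and elevation errors separately.
    """
    a1, e1, a2, e2 = map(math.radians, (az_est, el_est, az_true, el_true))
    cos_angle = (math.sin(e1) * math.sin(e2)
                 + math.cos(e1) * math.cos(e2) * math.cos(a1 - a2))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

def cumulative_histogram_auc(deviations, max_dev=180.0, step=1.0):
    """Normalized area under the cumulative histogram of deviations.

    For each threshold t in [0, max_dev], the curve gives the fraction of
    estimates whose deviation is at most t. The area is integrated with the
    trapezoidal rule and divided by max_dev, so 1.0 means every estimate
    is exact and smaller values mean larger errors.
    """
    n = len(deviations)
    if n == 0:
        raise ValueError("need at least one deviation")
    n_steps = int(round(max_dev / step))
    thresholds = [i * step for i in range(n_steps + 1)]
    curve = [sum(d <= t for d in deviations) / n for t in thresholds]
    area = sum((curve[i] + curve[i + 1]) / 2.0 * step for i in range(n_steps))
    return area / max_dev

# Hypothetical estimates vs. ground truth: (azimuth, elevation) in degrees.
estimates = [(10.0, 5.0), (92.0, -3.0), (181.0, 20.0)]
truths = [(12.0, 4.0), (90.0, 0.0), (170.0, 25.0)]
devs = [angular_deviation_deg(*e, *t) for e, t in zip(estimates, truths)]
score = cumulative_histogram_auc(devs)
```

A single scalar of this kind summarizes the whole error distribution, which makes it convenient for ranking methods whose cumulative histogram curves cross.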

Highlights

  • Unmanned aerial vehicles (UAVs), ubiquitously known as drones, have found great use in a wide range of applications—from casual use in photography to search and rescue operations where human lives are at stake

  • We present our method for sound source localization (SSL), which was developed for the IEEE Signal Processing Cup (SP Cup) 2019 titled “Search and Rescue with drone-embedded SSL” [20]

  • All recordings were made with the UAV flying in an indoor environment; as such, the scope of our experiments described in this article was limited to indoor environments


Summary

Introduction

Unmanned aerial vehicles (UAVs), ubiquitously known as drones, have found great use in a wide range of applications, from casual use in photography to search and rescue operations where human lives are at stake. For noise-robust SSL, [8] proposed a generalized eigenvalue decomposition-based multiple signal classification (GEVD-MUSIC) algorithm combined with an adaptive method for estimating the noise correlation matrix. A method that combines information from the GCC between multiple microphone inputs, the dynamics of the UAV, and the Doppler shift in sound frequency due to motion was proposed by [3]. Since the UAV is a remote platform with limited computational capability, SSL algorithms must be computationally efficient so that sound sources can be triangulated in real time. One such algorithm, proposed by [4], is a modified version of the MUSIC algorithm based on incremental generalized singular value decomposition (iGSVD-MUSIC). To locate and track a moving sound source, [6] described an approach that combines time-frequency spatial filtering with a particle filter and performs well under noisy conditions.
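The GCC-based methods mentioned above rest on estimating the time difference of arrival between a pair of microphones. As a rough illustration, the sketch below estimates that delay with the PHAT weighting for a single microphone pair. It is a simplified, standard-library-only example and not the baseline implementation used in this work: the function names, the synthetic test signal, and the 44.1 kHz sampling rate are assumptions, and the naive O(n²) DFT stands in for the FFT that a practical implementation would use.

```python
import cmath
import math
import random

def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(n^2)); fine for short sketches."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out

def gcc_phat_delay(sig, ref, max_lag):
    """Delay of `sig` relative to `ref`, in samples, via GCC-PHAT.

    A positive result means `sig` arrives later than `ref`.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(sig) + len(ref)
    spec_sig = dft(list(sig) + [0.0] * (n - len(sig)))
    spec_ref = dft(list(ref) + [0.0] * (n - len(ref)))
    cross = [a * b.conjugate() for a, b in zip(spec_sig, spec_ref)]
    # PHAT weighting: keep only the phase of the cross-spectrum.
    eps = 1e-12
    weighted = [c / (abs(c) + eps) for c in cross]
    cc = [v.real for v in dft(weighted, inverse=True)]
    # Negative lags wrap around to the end of the correlation sequence.
    return max(range(-max_lag, max_lag + 1), key=lambda lag: cc[lag % n])

# Synthetic example: the second channel is the first delayed by 3 samples.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(61)]
ref = noise + [0.0] * 3
sig = [0.0] * 3 + noise
delay = gcc_phat_delay(sig, ref, max_lag=10)

fs = 44100  # assumed sampling rate in Hz, chosen only for illustration
tdoa_seconds = delay / fs
```

With an array of several microphones, pairwise delays of this kind are what an angular spectrum method maps onto candidate azimuth and elevation angles.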
