Abstract

This paper addresses sound source separation and identification for noise-contaminated acoustic signals recorded with a microphone array embedded in an Unmanned Aerial Vehicle (UAV), aiming at quick, wide-area detection of human voices in disaster situations. The key approach is a Deep Neural Network (DNN); however, it is well known that training a DNN requires a huge dataset to achieve good performance. In practical applications, building such a dataset is often unrealistic owing to the cost of manual data annotation. Therefore, we propose a Partially-Shared Deep Neural Network (PS-DNN) that can learn multiple tasks simultaneously with a small amount of annotated data. Preliminary results show that the PS-DNN outperforms, in identification accuracy, conventional DNN-based approaches that require fully-annotated training data. In addition, it maintains performance even when noise-suppressed signals are used for sound source separation training and partially annotated data is used for sound source identification training.
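The core idea of a partially-shared architecture is that a common trunk of layers feeds two task-specific heads, so training data for either task updates the shared representation. The following is a minimal forward-pass sketch of that structure; the layer sizes, activations, and head definitions are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_freq = 257     # assumed spectrogram feature size per frame
n_shared = 128   # assumed width of the shared layer
n_classes = 3    # assumed identification classes (e.g. voice / noise / other)

# Shared layer: updated by gradients from BOTH tasks during training,
# which is what lets partially annotated data still improve it.
W_shared = rng.standard_normal((n_freq, n_shared)) * 0.01
# Task-specific heads (each trained only with its own task's labels):
W_sep = rng.standard_normal((n_shared, n_freq)) * 0.01    # separation: spectral mask
W_id = rng.standard_normal((n_shared, n_classes)) * 0.01  # identification: class scores

def ps_dnn_forward(x):
    """Run one batch of noisy spectrogram frames through the sketch PS-DNN."""
    h = relu(x @ W_shared)         # partially shared representation
    mask = sigmoid(h @ W_sep)      # separation head: per-frequency soft mask
    logits = h @ W_id              # identification head: class logits
    return mask, logits

x = rng.standard_normal((4, n_freq))  # batch of 4 frames
mask, logits = ps_dnn_forward(x)
print(mask.shape, logits.shape)  # (4, 257) (4, 3)
```

With this layout, a frame annotated only with a class label contributes a loss through the identification head, while an unlabeled noise-suppressed signal contributes through the separation head; both backpropagate into `W_shared`.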
