Abstract
This paper addresses sound source separation and identification for noise-contaminated acoustic signals recorded with a microphone array embedded in an Unmanned Aerial Vehicle (UAV), aiming at people's voice detection quickly and widely in a disaster situation. The key approach to achieve this is Deep Neural Network (DNN), but it is well known that training a DNN needs a huge dataset to improve its performance. In a practical application, building such a dataset is not often realistic owing to the cost of manual data annotation. Therefore, we propose a Partially-Shared Deep Neural Network (PS-DNN) which can learn multiple tasks at the same time with a small amount of annotated data. Preliminary results show that the PS-DNN outperforms conventional DNN-based approaches which require fully-annotated data in training in terms of identification accuracy. In addition, it maintains performance even when noise-suppressed signals are used for sound source separation training, and partially annotated data is used for sound source identification training.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have