Partially Shared Deep Neural Network in sound source separation and identification using a UAV-embedded microphone array

Takayuki Morito,Osamu Sugiyama,Kazuhiro Nakadai,Ryosuke Kojima

doi:10.1109/iros.2016.7759215

Abstract

This paper addresses sound source separation and identification for noise-contaminated acoustic signals recorded with a microphone array embedded in an Unmanned Aerial Vehicle (UAV), aiming at people's voice detection quickly and widely in a disaster situation. The key approach to achieve this is Deep Neural Network (DNN), but it is well known that training a DNN needs a huge dataset to improve its performance. In a practical application, building such a dataset is not often realistic owing to the cost of manual data annotation. Therefore, we propose a Partially-Shared Deep Neural Network (PS-DNN) which can learn multiple tasks at the same time with a small amount of annotated data. Preliminary results show that the PS-DNN outperforms conventional DNN-based approaches which require fully-annotated data in training in terms of identification accuracy. In addition, it maintains performance even when noise-suppressed signals are used for sound source separation training, and partially annotated data is used for sound source identification training.

Full Text