Abstract

Sound Direction Detection (SDD) comes naturally to human beings but is a very challenging task for sound processing algorithms, systems, and humanoid robots. Accurate detection of the direction of an approaching sound signal can be achieved using deep learning algorithms. In this paper, we propose a novel approach that detects the direction of approaching sound signals using two microphones as audio sensors and a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) for estimating sound direction. Two datasets are generated by capturing sound signals over a 360-degree plane with two microphones: the first with an increment of 1 degree (10 audio files per degree) and the second with an increment of 45 degrees (1000 audio files per 45 degrees). The Interaural Level Difference (ILD) between the signals captured by the two microphones is then computed, and Mel Frequency Cepstral Coefficients (MFCC) of the ILD are used to train the LSTM-RNN. The network extracts distinguishable features from the MFCC, learns to find the direction of the sound signal, and performs azimuth estimation. Testing accuracies of 82% and 95% are achieved for the datasets with 1-degree and 45-degree precision, respectively. The proposed method differs from other systems in that it achieves higher accuracy and does not rely on publicly available datasets; instead, we developed a dataset under normal acoustic conditions, on which the network is trained, so the system performs accurately in typical acoustic environments. The proposed system finds applications in improved human-robot interaction, enhanced humanoid reflexes, sound-direction-based camera rotation, and sound-direction-based detection of approaching emergency vehicles.
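The ILD feature at the heart of this pipeline is the per-frame level difference (in dB) between the two microphone channels. A minimal sketch of that computation is shown below; the frame length, hop size, and the synthetic two-channel test tone are illustrative assumptions, not the parameters used in the paper.

```python
import numpy as np

def frame_ild(left, right, frame_len=512, hop=256, eps=1e-12):
    """Per-frame Interaural Level Difference (dB) between two mic channels.

    A positive value means the left channel is louder, suggesting the
    source lies toward the left microphone.
    """
    n_frames = 1 + (len(left) - frame_len) // hop
    ilds = np.empty(n_frames)
    for i in range(n_frames):
        l = left[i * hop : i * hop + frame_len]
        r = right[i * hop : i * hop + frame_len]
        rms_l = np.sqrt(np.mean(l ** 2) + eps)  # eps avoids log(0) on silence
        rms_r = np.sqrt(np.mean(r ** 2) + eps)
        ilds[i] = 20.0 * np.log10(rms_l / rms_r)
    return ilds

# Synthetic demo: identical 440 Hz tone, right channel attenuated by 6 dB,
# so every frame should report an ILD of about +6 dB.
t = np.linspace(0, 1, 16000, endpoint=False)
tone = np.sin(2 * np.pi * 440 * t)
left, right = tone, tone * 10 ** (-6 / 20)
print(round(float(frame_ild(left, right).mean()), 1))  # → 6.0
```

In the full method, this ILD sequence would then be converted to MFCCs and fed to the LSTM-RNN, which maps the feature trajectory to an azimuth estimate.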
