Abstract
Sound source localization, especially of speech and speakers, has recently become one of the most significant techniques because it is used in applications such as smart environments, industry, robots, and audio conferencing. These applications demand high accuracy. This paper proposes a speaker localization method that relies on speech signals in closed spaces and employs fusion techniques and neural network (NN) algorithms to improve accuracy. The proposed work classifies speaker signals in three phases: preprocessing, feature extraction, and classification. A data fusion technique was used to generate the speaker dataset. In the feature extraction phase, a feature fusion technique was used to construct a feature vector from Generalized Cross Correlation (GCC) for time delay estimation, and from Root-MUSIC and Minimum Variance Distortionless Response (MVDR) for estimating the direction of arrival of the signal source. In the classification phase, two NN algorithms were used: a Restricted Boltzmann Machine (RBM), implemented with the TensorFlow library, and Long Short-Term Memory (LSTM), implemented with the Keras library. The experimental results show accuracies of 99.84% and 99.15% for RBM and LSTM, respectively.
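The abstract names GCC as the time delay estimator but gives no details of its form. Purely as an illustration, the following Python sketch estimates the delay between two microphone signals with a generalized cross correlation. The PHAT weighting, the function name `gcc_delay`, the sampling rate, and the synthetic signals are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def gcc_delay(sig, ref, fs):
    """Estimate the delay (in seconds) of `sig` relative to `ref`.

    Hypothetical helper: uses GCC with PHAT weighting, which is an
    assumption; the paper does not state which weighting it uses.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(sig) + len(ref)
    sig_f = np.fft.rfft(sig, n=n)
    ref_f = np.fft.rfft(ref, n=n)

    # Cross-power spectrum, normalised to unit magnitude (PHAT weighting).
    cross = sig_f * np.conj(ref_f)
    cross /= np.abs(cross) + 1e-12

    # Back to the lag domain; reorder so lags run from -n//2 to +n//2.
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))

    # The lag with the largest correlation peak is the delay estimate.
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs

# Synthetic example: the second microphone hears the same noise burst
# 5 samples later than the first one.
fs = 16000
rng = np.random.default_rng(0)
mic1 = rng.standard_normal(1024)
mic2 = np.concatenate((np.zeros(5), mic1[:-5]))
estimated_delay = gcc_delay(mic2, mic1, fs)
```

In a pipeline like the one described, a delay estimate of this kind for each microphone pair could form one component of the fused feature vector, alongside the direction-of-arrival estimates.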