Abstract

Speaker identification is the process of automatically determining who is speaking from a set of speakers known to the model. It is crucial in voice-based authentication, forensic investigations, security, and surveillance. Recent studies have shown that combinations of convolutional neural network (CNN) and recurrent neural network (RNN) variants outperform either model used separately. However, only a limited number of studies have applied such CNN-RNN combinations to speaker identification. In this study, we propose a speaker identification model that combines a two-dimensional CNN (2DCNN) with a bidirectional gated recurrent unit (BiGRU) to improve performance. The proposed model integrates the complementary strengths of the two components: the 2DCNN layers extract short-term spatial features from the input while requiring relatively few parameters, and the BiGRU layers capture long-term temporal dependencies among these features in both the forward and backward directions while converging efficiently during training. Spectrograms of the speech signal are used as input because they retain rich speaker-specific acoustic features. For comparison, additional experiments were conducted with 2DCNN, CNN-LSTM, CNN-BiLSTM, and CNN-GRU models. All experiments were conducted on the VoxCeleb1 audio dataset, which consists of 153,516 utterances collected from 1,251 speakers. The proposed model achieves an accuracy of 98.28%, precision of 99.08%, recall of 98.92%, and F1 score of 98.97%. A comparison with existing works and with the other models evaluated in this study shows that the proposed model achieves higher performance than both.
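
The abstract describes the architecture only at a high level. The following PyTorch sketch illustrates one plausible way a spectrogram input, a 2DCNN front end, and a BiGRU back end could be combined for 1,251-speaker classification; the layer counts, channel sizes, and hidden dimensions are assumptions for illustration, not the authors' reported configuration.

import torch
import torch.nn as nn

class CNNBiGRU(nn.Module):
    """Illustrative hybrid 2DCNN + BiGRU speaker-identification model."""

    def __init__(self, n_speakers=1251, n_mels=64, cnn_channels=(32, 64)):
        super().__init__()
        # 2D convolutional front end: extracts short-term spatial features
        # from the input spectrogram (batch, 1, n_mels, time).
        self.cnn = nn.Sequential(
            nn.Conv2d(1, cnn_channels[0], kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(cnn_channels[0], cnn_channels[1], kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        freq_out = n_mels // 4  # two 2x poolings along the frequency axis
        # Bidirectional GRU: models long-term temporal dependencies over the
        # CNN feature sequence in both the forward and backward directions.
        self.bigru = nn.GRU(
            input_size=cnn_channels[1] * freq_out,
            hidden_size=128,
            batch_first=True,
            bidirectional=True,
        )
        self.classifier = nn.Linear(2 * 128, n_speakers)

    def forward(self, spec):
        # spec: (batch, 1, n_mels, time)
        feats = self.cnn(spec)                                 # (batch, C, F, T)
        b, c, f, t = feats.shape
        seq = feats.permute(0, 3, 1, 2).reshape(b, t, c * f)   # (batch, T, C*F)
        out, _ = self.bigru(seq)                               # (batch, T, 256)
        pooled = out.mean(dim=1)                               # temporal average pooling
        return self.classifier(pooled)                         # speaker logits

if __name__ == "__main__":
    model = CNNBiGRU()
    dummy = torch.randn(2, 1, 64, 300)  # two spectrograms: 64 mel bins, 300 frames
    print(model(dummy).shape)           # torch.Size([2, 1251])

In this sketch the CNN output is flattened along the frequency axis so that each time step becomes one element of the sequence fed to the BiGRU, reflecting the division of labour described in the abstract: convolutions for local spatial patterns, the recurrent layer for long-range temporal context.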
