Genetic Algorithm for Combined Speaker and Speech Recognition using Deep Neural Networks

Gurpreet Kaur, Mohit Srivastava, Amod Kumar

doi:10.26636/jtit.2018.119617

Abstract

The field of speech and speaker recognition has grown rapidly as artificial intelligence algorithms have been applied to it. Speech conveys not only a message in the spoken language but also the speaker's emotions, gender and identity. Many real-world healthcare applications are based on speech and speaker recognition, e.g. a voice-controlled wheelchair that the user operates by speech. In this paper, we use a genetic algorithm (GA) for combined speaker and speech recognition, relying on optimized Mel Frequency Cepstral Coefficient (MFCC) speech features, with classification performed by a Deep Neural Network (DNN). In the first phase, features are extracted using MFCC and then optimized using GA. In the second phase, training is conducted using DNN. The proposed model is evaluated and validated in a real environment, and its efficiency is calculated on the basis of such parameters as accuracy, precision rate, recall rate, sensitivity, and specificity. The paper also evaluates the feature extraction methods linear predictive coding coefficients (LPCC), perceptual linear prediction (PLP), MFCC and relative spectra filtering (RASTA), all of them used for combined speaker and speech recognition systems. Different methods based on existing techniques are compared as well, for both clean and noisy environments.
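The GA-based feature optimization step can be illustrated with a minimal sketch. The abstract does not say how the GA encodes or scores features, so everything here is an assumption: the GA is treated as a binary feature-selection mask over 13 MFCC coefficients, the `RELEVANCE` values are made-up numbers, and the toy `fitness` function stands in for what would more plausibly be the validation accuracy of the DNN trained on the selected coefficients. The names `select_features`, `tournament`, `crossover` and `mutate` are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical per-coefficient relevance scores for 13 MFCCs (made-up values).
RELEVANCE = [0.90, 0.75, 0.60, 0.42, 0.30, 0.21, 0.12,
             0.08, 0.04, 0.03, 0.02, 0.01, 0.01]
PENALTY = 0.05  # assumed cost charged for every selected coefficient


def fitness(mask, relevance=RELEVANCE, penalty=PENALTY):
    """Toy fitness: relevance of the kept coefficients minus a size penalty.

    Stand-in for the validation accuracy of a classifier trained on the
    selected features; an empty selection is scored as unusable.
    """
    if not any(mask):
        return float("-inf")
    kept = sum(r for r, bit in zip(relevance, mask) if bit)
    return kept - penalty * sum(mask)


def tournament(population, scores, rng, k=3):
    """Return the fittest of k randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: scores[i])]


def crossover(parent_a, parent_b, rng):
    """Single-point crossover of two equal-length bit masks."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def mutate(mask, rng, rate):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in mask]


def select_features(n_features, pop_size=30, generations=40,
                    mutation_rate=0.05, seed=0):
    """Evolve a bit mask over n_features; return (best_mask, best_score)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    scores = [fitness(m) for m in population]
    for _ in range(generations):
        elite = population[scores.index(max(scores))]
        next_population = [elite[:]]  # elitism: the best mask survives unchanged
        while len(next_population) < pop_size:
            child = crossover(tournament(population, scores, rng),
                              tournament(population, scores, rng), rng)
            next_population.append(mutate(child, rng, mutation_rate))
        population = next_population
        scores = [fitness(m) for m in population]
    best = scores.index(max(scores))
    return population[best], scores[best]


best_mask, best_score = select_features(len(RELEVANCE))
print("selected coefficient indices:",
      [i for i, bit in enumerate(best_mask) if bit])
```

In a pipeline like the one the abstract outlines, the toy `fitness` would be replaced by a function that trains or evaluates the classifier on the masked MFCC vectors. That makes each fitness call far more expensive, which is why the population size and generation count are kept small in this sketch.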

Journal: Journal of Telecommunications and Information Technology
Publication Date: Jun 29, 2018
Citations: 8
License type: cc-by
