Neural network based speaker classification and verification systems with enhanced features

Zhenhao Ge, Srinath Cheluvaraja, Aravind Ganapathiraju, Ananth N Iyer, Ram Sundaram

doi:10.1109/intellisys.2017.8324265

Abstract

This work presents a novel framework based on a feed-forward neural network for text-independent speaker classification and verification, two related tasks in speaker recognition. With optimized features and model training, it achieves a 100% classification rate in speaker classification and an Equal Error Rate (EER) below 6% in speaker verification, using only about 1 second and 5 seconds of data, respectively. Two feature enhancements are shown to improve system performance: Voice Activity Detection (VAD) that is stricter than the one typically used for speech recognition, which ensures that the more strongly voiced portions of speech are extracted, and speaker-level mean and variance normalization, which helps eliminate the discrepancy between samples from the same speaker. In building the neural network speaker classifier, the network structure parameters are optimized with grid search, and dynamically reduced regularization parameters are used to keep training from terminating in a local minimum, which allows training to proceed further and reach a lower cost. In speaker verification, performance is improved with prediction score normalization, which rewards speaker identity indices with distinct peaks and penalizes weak ones that have high scores but more competitors, and with speaker-specific thresholding, which significantly reduces the EER on the ROC curve. The TIMIT corpus at an 8 kHz sampling rate is used. The first 200 male speakers are used to train and test classification performance. Their test files serve as in-domain registered speakers, while data from the remaining 126 male speakers serve as out-of-domain speakers, i.e., impostors in speaker verification.
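
The abstract does not spell out how speaker-level normalization or the Equal Error Rate is computed, so the following is only a minimal sketch under stated assumptions. It assumes that "speaker-level" means the mean and variance are pooled over all frames of all utterances from one speaker, and that the EER is approximated by sweeping a global threshold over the observed scores and taking the point where the false-acceptance and false-rejection rates are closest. The function names `speaker_level_mvn` and `equal_error_rate` and the toy scores are hypothetical and do not come from the paper.

```python
from math import sqrt


def speaker_level_mvn(utterances, eps=1e-8):
    """Normalize one speaker's features to zero mean and unit variance.

    Assumption: statistics are pooled over ALL utterances of the speaker.
    `utterances` is a list of utterances; each utterance is a list of
    frames; each frame is a list of floats of equal length.
    """
    frames = [frame for utt in utterances for frame in utt]
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    std = [sqrt(sum((f[d] - mean[d]) ** 2 for f in frames) / n)
           for d in range(dim)]
    # eps guards against division by zero for constant dimensions.
    return [[[(f[d] - mean[d]) / (std[d] + eps) for d in range(dim)]
             for f in utt]
            for utt in utterances]


def equal_error_rate(genuine, impostor):
    """Approximate the EER from two lists of verification scores.

    `genuine` holds scores of registered speakers claiming their own
    identity; `impostor` holds scores of out-of-domain speakers.
    A trial is accepted when its score is >= the threshold.
    """
    best_gap, eer = float("inf"), None
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(s < t for s in genuine) / len(genuine)      # false rejections
        far = sum(s >= t for s in impostor) / len(impostor)   # false acceptances
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy data (made up for illustration only).
normalized = speaker_level_mvn([[[1.0, 10.0], [3.0, 10.0]], [[5.0, 10.0]]])
eer = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
```

In this sketch a single threshold is shared by all speakers. The speaker-specific thresholding mentioned in the abstract would instead select a separate threshold per registered speaker, which is not shown here.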

Similar Papers

Speaker verification with TIMIT corpus - some remarks on classical methods
Adam Dustor
23 Sep 2020

Bottleneck and Embedding Representation of Speech for DNN-based Language and Speaker Recognition
Alicia Lozano-Diez ... Javier Gonzalez-Dominguez
21 Nov 2018

Speaker Identification Using Empirical Mode Decomposition-Based Voice Activity Detection Algorithm under Realistic Conditions
M.S Rudramurthy ... Nilabh Kumar Pathak
Journal of Intelligent Systems | VOL. 23
02 Apr 2014

Text-independent speaker recognition using non-linear frame likelihood transformation
Konstantin P Markov ... Seiichi Nakagawa
Speech Communication | VOL. 24
01 Jun 1998