An ASR system using MFCC and VQ/GMM with emphasis on environmental dependency

Bidhan Barai, Subhadip Basu, Nibaran Das, Mita Nasipuri, Debayan Das

doi:10.1109/calcon.2017.8280756

Abstract

Automatic speaker recognition (ASR), also known as voice biometric recognition, has remained a very popular research area for over six decades. Among the acoustic features used in ASR, Mel-Frequency Cepstral Coefficients (MFCCs) and Gammatone Frequency Cepstral Coefficients (GFCCs) are the most popular. However, to make ASR environment-independent, Relative Spectral Amplitude (RASTA) filtering is applied before feature extraction, and normalization techniques are applied in the feature, model, and score (classification step) domains. The modeling/classification techniques in current use include Vector Quantization (VQ), Support Vector Machines (SVMs), Gaussian Mixture Models (GMMs), Hidden Markov Models (HMMs), Artificial Neural Networks (ANNs), and Deep Neural Networks (DNNs). In this paper we report our experimental results on three databases, namely Hyke-2011, ELSDSR, and IITG-MV SR Phase-I, based on MFCCs and VQ/GMM, where the Maximum Log-Likelihood (MLL) scoring technique is used to recognize speakers. Experimental results under environmental mismatch conditions for the IITG-MV SR Phase I & II databases are also provided, with an explanation of the accuracy degradation.
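As a rough illustration of the MLL scoring step mentioned in the abstract, the sketch below scores a set of feature frames against one GMM per enrolled speaker and picks the speaker whose model gives the highest total log-likelihood. It is not the authors' implementation: the diagonal covariances, the function names (`log_gaussian_diag`, `gmm_log_likelihood`, `identify`), and the 2-D toy numbers standing in for MFCC vectors are all assumptions made for the example, and model training is omitted.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Total log-likelihood of all frames under a GMM.

    gmm is a list of (weight, mean, variance) tuples (assumed layout).
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gaussian_diag(x, mu, var) for w, mu, var in gmm]
        peak = max(comps)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total

def identify(frames, models):
    """Return the speaker whose model maximizes the log-likelihood, plus all scores."""
    scores = {name: gmm_log_likelihood(frames, gmm) for name, gmm in models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical 2-D models standing in for GMMs trained on MFCC vectors.
models = {
    "speaker_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "speaker_b": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}
test_frames = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]
best, scores = identify(test_frames, models)
print(best)
```

In a closed-set setting like the one the abstract describes, every test utterance is assigned to one of the enrolled speakers, so the decision reduces to this argmax over per-speaker scores.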

Similar Papers

VQ/GMM-Based Speaker Identification with Emphasis on Language Dependency
Bidhan Barai ... Nibaran Das
01 Jan 2019

Closed-Set Text-Independent Automatic Speaker Recognition System Using VQ/GMM
Bidhan Barai ... Subhadip Basu
01 Jan 2018

Speaker recognition utilizing distributed DCT-II based Mel frequency cepstral coefficients and fuzzy vector quantization
M Afzal Hossan ... Mark A Gregory
International Journal of Speech Technology | VOL. 16
28 Jun 2012

Closed-Set Device-Independent Speaker Identification Using CNN
Tapas Chakraborty ... Subhadip Basu
01 Jan 2020
