VTLN-warped Gaussian posteriorgram for QbE-STD

Maulik C Madhavi, Hemant A Patil

doi:10.23919/eusipco.2017.8081270

Abstract

Vocal Tract Length Normalization (VTLN) is an important speaker normalization technique for speech recognition tasks. In this paper, we propose using the Gaussian posteriorgram of VTLN-warped spectral features for Query-by-Example Spoken Term Detection (QbE-STD). The VTLN warping factor is estimated within a Gaussian Mixture Model (GMM) framework. This framework does not require phoneme-level transcription and hence can be useful for unsupervised tasks. We propose an iterative approach to VTLN warping factor estimation with two GMM training approaches, namely, Expectation-Maximization (EM) and Deterministic Annealing Expectation-Maximization (DAEM). The VTLN-warped Gaussian posteriorgram gave better QbE-STD performance than the unwarped Gaussian posteriorgram. QbE-STD performance on TIMIT was investigated under different evaluation factors, such as the number of Gaussian components in the GMM, various local constraints, and the number of iterations in VTLN warping factor estimation. Because VTLN warping reduces the speaker-specific variation in the Gaussian posteriorgram, the VTLN-warped Gaussian posteriorgram is expected to outperform the unwarped one.
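
As a rough illustration of what a Gaussian posteriorgram is, the sketch below computes, for each feature frame, the posterior probability of every GMM component. It is a minimal example under stated assumptions, not the authors' implementation: the diagonal covariances, the function names, and the toy parameter values are all hypothetical, and in the paper's setting the input frames would be VTLN-warped spectral features and the GMM would be trained with EM or DAEM.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian at feature vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def gaussian_posteriorgram(frames, weights, means, variances):
    """Return one posterior vector per frame: P(component k | frame t)."""
    posteriorgram = []
    for x in frames:
        # Log of (mixture weight * component likelihood) for each component.
        log_joint = [
            math.log(w) + log_gaussian_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        # Subtract the maximum before exponentiating for numerical stability.
        peak = max(log_joint)
        unnorm = [math.exp(lj - peak) for lj in log_joint]
        total = sum(unnorm)
        posteriorgram.append([u / total for u in unnorm])
    return posteriorgram


# Toy 2-component GMM over 2-D features (hypothetical values, not from the paper).
weights = [0.5, 0.5]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
frames = [[0.1, -0.2], [2.9, 3.1], [1.5, 1.5]]

post = gaussian_posteriorgram(frames, weights, means, variances)
```

Each row of `post` sums to one. In a QbE-STD system, such posterior vectors for the query and for the test utterance would typically be compared frame by frame and aligned with a dynamic time warping variant; that matching stage is not shown here.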

Publication Date: Aug 1, 2017
Citations: 23
License type: cc-by

Similar Papers

Vocal Tract Length Normalization using a Gaussian mixture model framework for query-by-example spoken term detection
Maulik C Madhavi ... Hemant A Patil
Computer Speech & Language | VOL. 58
23 Mar 2019

Design of mixture of GMMs for Query-by-Example Spoken Term Detection
Maulik C Madhavi ... Hemant A Patil
Computer Speech & Language | VOL. 52
04 May 2018

Spoken Term Detection Techniques
Leena Mary ... Deekshitha G
26 Sep 2018

Analysis of constraints on segmental DTW for the task of query-by-example spoken term detection
Sri Harsha Dumpala ... Anil Kumar Vuppala
01 Dec 2015
