Subspace-Based Representation and Learning for Phonotactic Spoken Language Recognition

Hung-Shin Lee, Yu Tsao, Shyh-Kang Jeng, Hsin-Min Wang

doi:10.1109/taslp.2020.3037457

Abstract

Phonotactic constraints can be employed to distinguish languages by representing a speech utterance as a multinomial distribution of phone events. In the present study, we propose a new learning mechanism based on subspace-based representation, which can extract concealed phonotactic structures from utterances, for language verification and dialect/accent identification. The framework comprises two successive parts. The first is subspace construction. Specifically, it decodes each utterance into a sequence of vectors filled with phone posteriors and transforms the vector sequence into a linear orthogonal subspace based on low-rank matrix factorization or dynamic linear modeling. The second is subspace learning based on kernel machines, such as support vector machines and the newly developed subspace-based neural networks (SNNs). The input layer of an SNN is designed specifically for samples represented as subspaces. The topology ensures that the same output can be derived from identical subspaces by modifying the conventional feed-forward pass to fit the mathematical definition of subspace similarity. Evaluated on the “General LR” test of NIST LRE 2007, the proposed method achieved up to 52%, 46%, 56%, and 27% relative reductions in equal error rates over the sequence-based PPR-LM, PPR-VSM, and PPR-IVEC methods and the lattice-based PPR-LM method, respectively. Furthermore, on the dialect/accent identification task of NIST LRE 2009, the SNN-based system performed better than the aforementioned four baseline methods.
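The subspace construction and subspace similarity described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a truncated SVD as the low-rank factorization and the projection kernel (the sum of squared cosines of the principal angles) as the similarity measure, and the function names, the rank of 5, and the random Dirichlet "posteriors" are all hypothetical.

```python
import numpy as np


def utterance_subspace(posteriors: np.ndarray, rank: int) -> np.ndarray:
    """Map a (frames x phones) posterior matrix to an orthonormal basis.

    Assumption: truncated SVD stands in for the low-rank factorization.
    Returns a (phones x rank) matrix with orthonormal columns.
    """
    # Left singular vectors of the (phones x frames) matrix are orthonormal.
    u, _, _ = np.linalg.svd(posteriors.T, full_matrices=False)
    return u[:, :rank]


def projection_kernel(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity of two subspaces given orthonormal bases a and b.

    ||a^T b||_F^2 equals the sum of squared cosines of the principal
    angles, so it depends only on the subspaces, not on the chosen bases.
    """
    return float(np.linalg.norm(a.T @ b, ord="fro") ** 2)


rng = np.random.default_rng(0)

# Hypothetical data: two utterances with posteriors over 40 phones.
x = rng.dirichlet(np.ones(40), size=200)  # 200 frames
y = rng.dirichlet(np.ones(40), size=150)  # 150 frames

ux = utterance_subspace(x, rank=5)
uy = utterance_subspace(y, rank=5)

k_xx = projection_kernel(ux, ux)  # self-similarity equals the rank
k_xy = projection_kernel(ux, uy)  # lies between 0 and the rank

# Rotating the basis inside the subspace leaves the similarity unchanged.
q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
k_rot = projection_kernel(ux @ q, uy)
```

Because `k_rot` matches `k_xy` up to floating-point error, any model built on this similarity returns the same output for two different bases of one subspace, which is the property the abstract attributes to the SNN input layer.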

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publication Date: Jan 1, 2020
Citations: 76

Similar Papers

System combination using auxiliary information for speaker verification
Luciana Ferrer ... Elizabeth Shriberg
01 Mar 2008

Speaker verification based on fusion of acoustic and articulatory information
Ming Li ... Shrikanth Narayanan
25 Aug 2013

Sentence‐HMM state‐based i‐vector/PLDA modelling for improved performance in text dependent single utterance speaker verification
Osman Büyük
IET Signal Processing | VOL. 10
01 Oct 2016

Out-of-domain detection based on confidence measures from multiple topic classification
L.R Lane ... T Matsui
17 May 2004
