Abstract

Language cues are an important component of spoken language identification (LID), but aligning them to speech segments through manual annotation by professional linguists is time-consuming. Instead of annotating linguistic phonemes, we exploit co-occurrence statistics in speech utterances to discover the underlying phoneme-like speech units in an unsupervised manner. We then model phonotactic constraints on the set of phoneme-like units to find larger speech segments, called suprasegmental phonemes, and extract multi-level language cues from them, including phonetic, phonotactic, and prosodic cues. Furthermore, we propose a novel LID system based on a TDNN architecture followed by an LSTM-RNN. The proposed system is built and compared with acoustic-feature-based and phonetic-feature-based methods on the NIST LRE07 and Arabic dialect identification tasks. The experimental results show that our system captures robust discriminative information for short-duration language identification and achieves high accuracy in dialect identification.
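To make the TDNN-followed-by-LSTM-RNN architecture concrete, the sketch below shows one plausible realization in PyTorch. The abstract does not give layer sizes or hyperparameters, so everything here is an assumption: the 40-dimensional frame features, the three dilated 1-D convolutions standing in for TDNN layers, the single LSTM layer, and the `n_languages=14` output (matching the number of LRE07 target languages) are illustrative choices, not the authors' configuration.

```python
# A minimal sketch of a TDNN -> LSTM-RNN utterance classifier for LID.
# Assumptions (not from the paper): 40-dim input features, three TDNN
# layers realized as dilated Conv1d, one LSTM layer, 14 output classes.
import torch
import torch.nn as nn

class TDNNLSTMClassifier(nn.Module):
    def __init__(self, feat_dim=40, tdnn_dim=512, lstm_dim=256, n_languages=14):
        super().__init__()
        # TDNN layers: 1-D convolutions over time with growing temporal context.
        self.tdnn = nn.Sequential(
            nn.Conv1d(feat_dim, tdnn_dim, kernel_size=5, dilation=1),
            nn.ReLU(),
            nn.Conv1d(tdnn_dim, tdnn_dim, kernel_size=3, dilation=2),
            nn.ReLU(),
            nn.Conv1d(tdnn_dim, tdnn_dim, kernel_size=3, dilation=3),
            nn.ReLU(),
        )
        # LSTM summarizes the TDNN outputs across the whole utterance.
        self.lstm = nn.LSTM(tdnn_dim, lstm_dim, batch_first=True)
        self.classifier = nn.Linear(lstm_dim, n_languages)

    def forward(self, x):
        # x: (batch, time, feat_dim) frame-level features
        h = self.tdnn(x.transpose(1, 2)).transpose(1, 2)  # (batch, time', tdnn_dim)
        _, (h_n, _) = self.lstm(h)                        # final hidden state
        return self.classifier(h_n[-1])                   # (batch, n_languages)

# Usage: a 3-second utterance at 100 frames/sec -> 300 frames.
model = TDNNLSTMClassifier()
logits = model(torch.randn(2, 300, 40))  # shape (2, 14)
```

Feeding the TDNN outputs into an LSTM is a common pattern for short-duration LID: the convolutions capture local phonetic context while the recurrent layer accumulates evidence over the full utterance.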
