Syntactic and Semantic Features For Code-Switching Factored Language Models

Heike Adel, Tanja Schultz, Katrin Kirchhoff, Dominic Telaar, Ngoc Thang Vu

doi:10.1109/taslp.2015.2389622

Abstract

This paper presents our latest investigations of different features for factored language models for Code-Switching speech and their effect on automatic speech recognition (ASR) performance. We focus on syntactic and semantic features that can be extracted from Code-Switching text data and integrate them into factored language models. We explore several possible factors: words, part-of-speech tags, Brown word clusters, open-class words, and clusters of open-class word embeddings. The experimental results reveal that Brown word clusters, part-of-speech tags, and open-class words are the most effective at reducing the perplexity of factored language models on the Mandarin-English Code-Switching corpus SEAME. In ASR experiments, two models yield the best mixed error rate results: the model containing Brown word clusters and part-of-speech tags, and the model that additionally includes clusters of open-class word embeddings. In summary, the best language model significantly reduces the perplexity on the SEAME evaluation set by up to 10.8% relative and the mixed error rate by up to 3.4% relative.
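As a rough illustration of the two ideas the abstract relies on, the sketch below shows how a token could be represented as a bundle of factors (word, part-of-speech tag, Brown cluster, open-class word) and how a "relative" reduction is computed. It is a minimal sketch and not the authors' implementation: the class `FactoredToken`, the helper names, the tag set in `OPEN_CLASS_POS`, the cluster IDs, and the numbers in the usage lines are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical set of POS tags treated as open-class (assumption for illustration).
OPEN_CLASS_POS = {"NN", "VB", "JJ", "RB"}
# Hypothetical placeholder used for the open-class factor of closed-class words.
CLOSED_PLACEHOLDER = "<closed>"


@dataclass(frozen=True)
class FactoredToken:
    """One token seen as a bundle of factors rather than a bare word."""
    word: str
    pos: str
    brown_cluster: str
    open_class: str


def to_factored(words: List[str], pos_tags: List[str],
                clusters: List[str]) -> List[FactoredToken]:
    """Zip parallel annotation sequences into factored tokens."""
    if not (len(words) == len(pos_tags) == len(clusters)):
        raise ValueError("all factor sequences must have the same length")
    return [
        FactoredToken(w, p, c, w if p in OPEN_CLASS_POS else CLOSED_PLACEHOLDER)
        for w, p, c in zip(words, pos_tags, clusters)
    ]


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative reduction in percent: 100 * (baseline - improved) / baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    # Made-up Mandarin-English utterance with made-up tags and cluster IDs.
    tokens = to_factored(
        ["我", "要", "去", "meeting"],
        ["PN", "VB", "VB", "NN"],
        ["0110", "1010", "1011", "0011"],
    )
    for token in tokens:
        print(token)
    # Made-up perplexities, used only to show the arithmetic.
    print(f"{relative_reduction(200.0, 180.0):.1f}% relative")
```

A factored language model conditions its prediction of the next word on some subset of these factors from the preceding tokens. Which factors to combine, and in what backoff order, is the question the paper investigates.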

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publication Date: Mar 1, 2015
Citations: 60

Similar Papers

Combining recurrent neural networks and factored language models during decoding of code-switching speech
Heike Adel ... Tanja Schultz
14 Sep 2014

Differences in brain potentials to open and closed class words: class and frequency effects
Thomas F Münte ... Sönke Johannes
Neuropsychologia | VOL. 39
13 Dec 2000

Novel speech processing techniques for robust automatic speech recognition
01 Jan 2006

Weighted finite-state transducer-based dysarthric speech recognition error correction using context-dependent pronunciation variation modelling
Woo Kyeong Seong ... Ji Hun Park
International Journal of Engineering Systems Modelling and Simulation | VOL. 6
01 Jan 2014
