Bayesian Learning of a Language Model from Continuous Speech

Graham Neubig, Shinsuke Mori, Tatsuya Kawahara, Masato Mimura

doi:10.1587/transinf.e95.d.614

Open Access

https://doi.org/10.1587/transinf.e95.d.614

Abstract

We propose a novel scheme to learn a language model (LM) for automatic speech recognition (ASR) directly from continuous speech. In the proposed method, we first generate phoneme lattices using an acoustic model with no linguistic constraints, then perform training over these phoneme lattices, simultaneously learning both lexical units and an LM. As a statistical framework for this learning problem, we use non-parametric Bayesian statistics, which make it possible to balance the learned model's complexity (such as the size of the learned vocabulary) and expressive power, and provide a principled learning algorithm through the use of Gibbs sampling. Implementation is performed using weighted finite state transducers (WFSTs), which allow for the simple handling of lattice input. Experimental results on natural, adult-directed speech demonstrate that LMs built using only continuous speech are able to significantly reduce ASR phoneme error rates. The proposed technique of joint Bayesian learning of lexical units and an LM over lattices is shown to significantly contribute to this improvement.
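As a rough illustration of how a non-parametric Bayesian prior can balance vocabulary size against expressive power, the sketch below computes the predictive probability of a word under a Dirichlet process (Chinese restaurant process) unigram model. This is a simplified stand-in chosen for illustration, not the model or implementation from the paper: the function names, the concentration parameter `alpha`, the phoneme inventory size, and the geometric-length base measure are all assumptions.

```python
from collections import Counter


def base_prob(word, n_phonemes=40, p_stop=0.5):
    """Assumed base measure over phoneme strings.

    Phonemes are drawn uniformly from an inventory of n_phonemes symbols,
    and the word length follows a geometric distribution with stop
    probability p_stop. Longer words therefore get smaller prior mass.
    """
    length = len(word)
    if length == 0:
        return 0.0
    return (1.0 / n_phonemes) ** length * (1 - p_stop) ** (length - 1) * p_stop


def predictive_prob(word, counts, total, alpha=1.0):
    """Dirichlet-process predictive probability of `word`.

    Words already in the lexicon are favoured in proportion to their
    counts; a new word is possible, but only with the small mass
    alpha * base_prob(word), which discourages needless vocabulary growth.
    """
    return (counts[word] + alpha * base_prob(word)) / (total + alpha)


# Toy lexicon (hypothetical data): words are tuples of phoneme symbols.
counts = Counter({("k", "ae", "t"): 3, ("d", "ao", "g"): 1})
total = sum(counts.values())

seen = predictive_prob(("k", "ae", "t"), counts, total)
unseen = predictive_prob(("b", "er", "d"), counts, total)
print(f"seen word: {seen:.6f}, unseen word: {unseen:.3e}")
```

In a Gibbs sampler over segmentations, a quantity of this kind would be re-evaluated each time a candidate word boundary is resampled, so reusing an existing lexical unit is usually preferred over creating a new one unless the data support it.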

Journal: IEICE Transactions on Information and Systems
Publication Date: Jan 1, 2012
Citations: 38
License type: free

Similar Papers

UNFOLD
Reza Yazdani ... Antonio González
14 Oct 2017

Exploring recurrent neural network based acoustic and linguistic modeling for children's speech recognition
Sreeram Ganji ... Rohit Sinha
01 Nov 2017

A discriminative model for continuous speech recognition based on Weighted Finite State Transducers
Shinji Watanabe ... Erik Mcdermott
01 Jan 2009

A Review of On-Device Fully Neural End-to-End Automatic Speech Recognition Algorithms
Chanwoo Kim ... Sungsoo Kim
01 Nov 2020
