Abstract

It is widely known that smoothing techniques are essential for n-gram-based statistical language modeling, especially in large vocabulary continuous speech recognition (LVCSR) tasks. This paper investigates several smoothing algorithms for n-gram models in Filipino LVCSR. The automatic speech recognition system was developed using the Janus Recognition Toolkit (JRTk) of Carnegie Mellon University and the Karlsruhe Institute of Technology. The language models were generated using the SRI Language Modeling toolkit (SRILM). The data consisted of approximately 60 hours of transcribed recordings of Filipino speech from several domains, spoken by 156 speakers. A total of 24 systems employing different language models were tuned and tested for improved performance against a baseline metric. A Kneser-Ney model with counts modified at the end, applied to n-grams of order 5, registered the highest word recognition accuracy at 80.9% and 81.3% on the development and evaluation test sets, respectively.
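
For readers unfamiliar with SRILM, the sketch below shows one way a 5-gram language model with Kneser-Ney smoothing could be estimated by invoking SRILM's ngram-count from Python. The wrapper function and file names are illustrative assumptions rather than details from the paper, and the modified-at-end count variant reported above is controlled by an additional SRILM option not shown here.

```python
# Minimal sketch (assumption): estimating a 5-gram language model with
# modified Kneser-Ney discounting via SRILM's ngram-count tool.
# File names (train.txt, kn5.arpa) are placeholders, not from the paper.
import subprocess


def build_kneser_ney_lm(train_text: str, lm_path: str, order: int = 5) -> None:
    """Run SRILM's ngram-count to estimate an interpolated Kneser-Ney n-gram model."""
    cmd = [
        "ngram-count",
        "-order", str(order),   # n-gram order (the paper's best system used 5)
        "-kndiscount",          # modified Kneser-Ney discounting
        "-interpolate",         # interpolate higher- and lower-order estimates
        "-text", train_text,    # training transcriptions, one sentence per line
        "-lm", lm_path,         # output language model in ARPA format
    ]
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    build_kneser_ney_lm("train.txt", "kn5.arpa", order=5)
```

The resulting ARPA-format model can then be plugged into the recognizer's decoder; comparing such models across smoothing options and n-gram orders is the kind of sweep the 24 systems in this study represent.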
