Acoustic and language models adaptation for Indonesian spontaneous speech recognition

Dessi Puji Lestari, Angela Irfani

doi:10.1109/icaicta.2015.7335375

https://doi.org/10.1109/icaicta.2015.7335375

Publication Date: Aug 1, 2015

Citations: 7

Affiliation: Bandung Institute of Technology

Abstract

The performance of Indonesian automatic speech recognition degrades significantly when recognizing spontaneous speech. Spontaneous speech has characteristics that differ from read speech in both acoustics and language rules. In spontaneous speech, pronunciation and expression vary depending on the speaker's fluency and the topic. Disfluencies disrupt otherwise fluent sentences and often violate the rules of the formal language. To improve the Indonesian automatic speech recognizer on spontaneous speech, several model enhancement methods were applied: adding spontaneous data and retraining both the acoustic model and the language model on those data; adapting the acoustic model using the maximum likelihood linear regression (MLLR) and maximum a posteriori (MAP) approaches; and adapting the language model by linear interpolation. Experimental results show that all methods are effective in increasing the capability of the Indonesian automatic speech recognizer to recognize spontaneous data. However, all methods decreased the accuracy of read speech recognition. On average, retraining both the acoustic and language models on a combination of read and spontaneous data is more effective than model adaptation. An absolute accuracy improvement of 28.34% is achieved after retraining both the language model and the acoustic model on a combination of read and spontaneous data.
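
As an illustration of the language model linear interpolation mentioned in the abstract, the sketch below mixes a read-speech model and a spontaneous-speech model as P(w) = lam * P_spont(w) + (1 - lam) * P_read(w). It is a minimal sketch, not the authors' implementation: the unigram simplification, the function names, the toy probabilities, and the weight lam = 0.4 are all assumptions made for illustration, and the abstract does not state which toolkit, n-gram order, or interpolation weight was used.

```python
import math


def interpolate(p_read, p_spont, lam):
    """Linear interpolation of two unigram models (hypothetical helper).

    Returns lam * p_spont(w) + (1 - lam) * p_read(w) for every word w
    that appears in either model.
    """
    vocab = set(p_read) | set(p_spont)
    return {
        w: lam * p_spont.get(w, 0.0) + (1.0 - lam) * p_read.get(w, 0.0)
        for w in vocab
    }


def perplexity(model, tokens, floor=1e-10):
    """Perplexity of a token sequence under a unigram model.

    Probabilities are floored to avoid log(0) for unseen words.
    """
    log_sum = sum(math.log(max(model.get(t, 0.0), floor)) for t in tokens)
    return math.exp(-log_sum / len(tokens))


# Toy, made-up probabilities (assumption): the filler "eh" is rare in
# read speech and frequent in spontaneous speech.
p_read = {"saya": 0.5, "pergi": 0.4, "eh": 0.1}
p_spont = {"saya": 0.3, "pergi": 0.2, "eh": 0.5}

mixed = interpolate(p_read, p_spont, lam=0.4)

# A made-up spontaneous utterance containing two fillers.
held_out = ["eh", "saya", "eh"]
ppl_read = perplexity(p_read, held_out)
ppl_mixed = perplexity(mixed, held_out)
```

In this toy setting the interpolated model assigns lower perplexity to the filler-heavy utterance than the read-speech model alone does. In practice the weight would typically be tuned on held-out spontaneous text rather than fixed by hand.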
