Exploring tonal information for Lhasa dialect acoustic modeling

Jianwu Dang, Gyaltsen Lobsang, Hongcui Wang, Longbiao Wang, Kuntharrgyal Khuru, Jian Li

doi:10.1109/iscslp.2016.7918447

Abstract

Detailed analysis of the tonal features of the Tibetan Lhasa dialect is an important task for Tibetan automatic speech recognition (ASR) applications. However, tonal information is difficult to exploit because the number of tonal patterns in the Lhasa dialect remains controversial. As a result, few studies have modeled the tonal information of the Lhasa dialect for speech recognition. We therefore investigated how tonal information affects the performance of Lhasa Tibetan speech recognition. Because no tonal pattern has yet been conclusively established for Lhasa Tibetan, this study adopted a four-tone pattern and designed a phone set based on the four-contour-contrast scheme. Speech recognition performance was examined using acoustic models with and without pitch-related features. The experimental results showed that applying the tone-based phone set and pitch-related features to DNN-HMM based speech recognition improved the character error rate (CER) by 11% compared with the system without tonal information. This preliminary study indicates that tonal information plays an important role in speech recognition of the Tibetan Lhasa dialect.
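For readers unfamiliar with the metric, the following is a minimal sketch of how a character error rate is commonly computed: the edit distance between the reference and the recognized character sequences, divided by the reference length. The abstract does not state how CER was computed or which unit counts as a character, so the function names `edit_distance` and `cer` and the choice of unit (one element of the input sequence) are assumptions made for illustration only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (assumed scoring unit: one element)."""
    # prev[j] holds the distance between the first i-1 items of ref and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from the first i items of ref to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: edit distance normalized by the reference length."""
    if len(ref) == 0:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

Passing lists instead of strings lets the caller decide what a "character" is, for example a list of syllables rather than individual code points.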

Full Text