Abstract

In this work, tone modeling approaches are used to represent the tonal structure of Vietnamese, and tonal features are also used to build the acoustic models. Results on large vocabulary continuous speech recognition (LVCSR) using deep bottleneck features (DBNFs) and different types of pronunciation dictionary are also presented. The experiments are carried out on a dataset of speech from the Voice of Vietnam (VOV) channel. The results show that the system using tonal phonemes achieves a 19.25% relative improvement over the best non-tonal phoneme system. The DBNF systems are applicable to the tonal dictionary, and adding the tonal feature as an input feature of the network yields a relative improvement in recognition performance of around 18%.

