Segmentation and Alignment of Chinese and Khmer Bilingual Names Based on Hierarchical Dirichlet Process

Bingbing Yu, Feng Zhou, Guangyi Xu, Qingling Lei, Xin Yan, Yu Nuo

doi:10.1007/978-3-030-00214-5_56

Abstract

Transliteration is an important foundation of cross-language natural language processing. To address the problem of Khmer-Chinese name transliteration, this paper proposes a Chinese-Khmer name transliteration method based on the hierarchical Dirichlet process. The hierarchical Dirichlet process model allows many-to-many alignment, addresses overfitting, accounts for the influence of the previous syllable alignment on the alignment of the next syllable, and effectively mines latent information in natural language. The paper first builds a Chinese-Khmer name alignment model based on the hierarchical Dirichlet process, then uses the aligned Chinese and Khmer syllables as the training corpus and trains a Chinese-Khmer transliteration model with Moses, and finally evaluates the alignment model through the performance of the resulting transliteration model. The results show that first aligning Chinese and Khmer names with the alignment model based on the hierarchical Dirichlet process and then transliterating the names yields better performance.
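
To make the idea of one syllable alignment influencing the next more concrete, the following is a minimal Python sketch and not the authors' model. It assumes a simplified, count-based approximation of a two-level hierarchical Dirichlet process: a context-specific distribution over aligned syllable pairs backs off to a shared distribution, which in turn backs off to a uniform base over a finite candidate set. It omits table bookkeeping and sampling-based inference. The class name, the concentration parameters `alpha_uni` and `alpha_bi`, and the placeholder tokens such as `ZH_A` and `KM_A` are all hypothetical.

```python
from collections import Counter, defaultdict


class HierarchicalPairModel:
    """Simplified two-level back-off over aligned (Chinese, Khmer) syllable pairs.

    Assumption: counts stand in for restaurant customers, and the base
    distribution is uniform over a finite list of candidate pairs.
    """

    def __init__(self, candidate_pairs, alpha_uni=1.0, alpha_bi=1.0):
        self.candidates = list(candidate_pairs)
        self.alpha_uni = alpha_uni  # concentration of the shared level
        self.alpha_bi = alpha_bi    # concentration of the context level
        self.uni = Counter()               # counts of each pair
        self.bi = defaultdict(Counter)     # counts of each pair per previous pair

    def observe(self, aligned_name):
        """Add one name, given as a sequence of aligned syllable pairs."""
        prev = None  # None marks the start of a name
        for pair in aligned_name:
            self.uni[pair] += 1
            self.bi[prev][pair] += 1
            prev = pair

    def p_uni(self, pair):
        """Shared-level probability, backing off to the uniform base."""
        base = 1.0 / len(self.candidates)
        total = sum(self.uni.values())
        return (self.uni[pair] + self.alpha_uni * base) / (total + self.alpha_uni)

    def p_bi(self, pair, prev):
        """Probability of a pair given the previous pair, backing off to p_uni."""
        ctx = self.bi.get(prev, Counter())
        total = sum(ctx.values())
        return (ctx[pair] + self.alpha_bi * self.p_uni(pair)) / (total + self.alpha_bi)


# Placeholder aligned pairs (hypothetical tokens, not real transliterations).
A, B, C = ("ZH_A", "KM_A"), ("ZH_B", "KM_B"), ("ZH_C", "KM_C")

model = HierarchicalPairModel([A, B, C])
model.observe([A, B])
model.observe([A, B])
model.observe([A, C])

# After seeing A followed by B twice, B is favoured in the context of A.
print(model.p_bi(B, prev=A), model.p_uni(B))
```

In this sketch a pair that was never seen after a given context still receives probability mass through the back-off chain, which illustrates one way a hierarchical prior can limit overfitting to sparse name data.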

Full Text