Thai Word Segmentation Based on Sequence-to-Sequence Model

Hui Wen Wu, Xinyu Xu, Yong Zhong Huang, Can Cheng Li, Haoyu Zhuang

doi:10.1088/1742-6596/1757/1/012065

Open Access

Journal: Journal of Physics: Conference Series
Publication Date: Jan 1, 2021
Citations: 1
License type: cc-by

Affiliation: Guilin University of Electronic Technology

#Thai Word Segmentation #Word Segmentation

Abstract

As a low-resource language, Thai still has considerable room for improvement in word segmentation performance. In this paper, we investigate a sequence-to-sequence model for Thai word segmentation that uses two different recurrent neural networks and transforms an input sequence into an output sequence. We evaluate the model on datasets from four different fields and compare it with several other word segmentation models; the F1 value on the encyclopedia dataset reaches 97.15%. The results show that the proposed model has superior performance and is more effective. Notably, the expected results can be achieved even with limited data resources.
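The abstract reports an F1 value but does not say here how it is computed. As a minimal sketch, assuming the common word-level definition (a predicted word counts as correct only if both of its boundaries match the gold segmentation), the metric could be calculated as follows. The function names `spans` and `segmentation_f1` and the toy Latin-letter input are illustrative assumptions, not taken from the paper.

```python
def spans(words):
    """Convert a list of words into a set of (start, end) character spans."""
    out, pos = set(), 0
    for w in words:
        out.add((pos, pos + len(w)))
        pos += len(w)
    return out


def segmentation_f1(gold, pred):
    """Word-level precision, recall and F1 for one segmented sentence.

    Assumption: a predicted word is correct only if its exact span
    also appears in the gold segmentation.
    """
    if "".join(gold) != "".join(pred):
        raise ValueError("gold and predicted words must cover the same text")
    g, p = spans(gold), spans(pred)
    correct = len(g & p)
    precision = correct / len(p) if p else 0.0
    recall = correct / len(g) if g else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


# Toy placeholder text: the gold segmentation has three words,
# the prediction merges the last two.
gold = ["ab", "c", "de"]
pred = ["ab", "cde"]
print(segmentation_f1(gold, pred))
```

In this toy case one of the two predicted words matches one of the three gold words, so precision is 1/2 and recall is 1/3. A corpus-level score would typically sum the correct, predicted, and gold counts over all sentences before computing the ratios.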

Full Text