Abstract

Thai is a low-resource language, and its word segmentation performance still leaves considerable room for improvement. In this paper, we investigate a sequence-to-sequence model for Thai word segmentation that uses two different recurrent neural networks to transform an input sequence into an output sequence. We evaluate the model on datasets from four different domains and compare it with several other word segmentation models; on the encyclopedia dataset, the F1 score reaches 97.15%. The results show that the proposed model outperforms the compared models. Notably, it achieves the expected results even with limited data resources.
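
For readers unfamiliar with the metric, the following is a minimal sketch of how a word-level F1 score for segmentation is commonly computed. It assumes that a predicted word counts as correct only when both of its boundaries match the gold segmentation; the abstract does not state the exact scoring procedure, so this is an illustration rather than the authors' evaluation code. The function names `word_spans` and `segmentation_f1` are hypothetical.

```python
def word_spans(words):
    """Convert a list of words into a set of (start, end) character spans."""
    spans = set()
    start = 0
    for word in words:
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def segmentation_f1(gold_words, predicted_words):
    """Word-level F1: a predicted word is correct only if both boundaries match."""
    if "".join(gold_words) != "".join(predicted_words):
        raise ValueError("gold and predicted segmentations must cover the same text")
    gold = word_spans(gold_words)
    predicted = word_spans(predicted_words)
    correct = len(gold & predicted)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


# Illustrative example (not taken from the paper's datasets).
gold = ["ฉัน", "รัก", "ภาษา", "ไทย"]
predicted = ["ฉัน", "รัก", "ภาษาไทย"]
score = segmentation_f1(gold, predicted)
print(f"F1 = {score:.3f}")
```

Here two of the three predicted words match gold words exactly, so precision is 2/3 and recall is 2/4.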
