Towards Burmese (Myanmar) Morphological Analysis

Chenchen Ding, Hnin Thu Zar Aye, Masao Utiyama, Khin Mar Soe, Khin Thandar Nwet, Eiichiro Sumita, Win Pa Pa

doi:10.1145/3325885

Abstract

This article presents a comprehensive study of two primary tasks in Burmese (Myanmar) morphological analysis: tokenization and part-of-speech (POS) tagging. Twenty thousand Burmese newswire sentences are annotated with two-layer tokenization and POS-tagging information, as one component of the Asian Language Treebank Project. The annotated corpus has been released under a CC BY-NC-SA license, and it was the largest open-access database of annotated Burmese at the time this manuscript was prepared in 2017. The first half of the article describes the preparation, refinement, and features of the annotated corpus in detail. The second half presents experiment-based investigations facilitated by the corpus, in which the standard sequence-labeling approach of conditional random fields and a long short-term memory (LSTM)-based recurrent neural network (RNN) are applied and discussed. We obtain several general conclusions, covering the effect of joint tokenization and POS-tagging and the importance of ensembling for stabilizing the performance of the LSTM-based RNN. This study provides a solid basis for further studies on Burmese processing.
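As a rough illustration of the two ideas named in the abstract, the sketch below shows one common way to cast joint tokenization and POS-tagging as a single sequence-labeling problem, and one simple way to ensemble several models. It is not taken from the article: the `B-`/`I-` label scheme, the POS names `n` and `v`, the placeholder units, and the per-position majority vote are all assumptions chosen for illustration, and the article's actual tag set and ensemble method may differ.

```python
from collections import Counter


def encode_joint(tokens):
    """Turn [(units, pos), ...] into one joint label per unit, e.g. 'B-n', 'I-n'.

    Assumption: 'B-' marks the first unit of a token and 'I-' a continuation,
    so a single label carries both the token boundary and the POS tag.
    """
    units, labels = [], []
    for token_units, pos in tokens:
        for i, unit in enumerate(token_units):
            units.append(unit)
            labels.append(("B-" if i == 0 else "I-") + pos)
    return units, labels


def decode_joint(units, labels):
    """Rebuild [(units, pos), ...] from joint labels; a 'B-' label opens a token."""
    tokens = []
    for unit, label in zip(units, labels):
        boundary, pos = label.split("-", 1)
        if boundary == "B" or not tokens:
            tokens.append(([unit], pos))
        else:
            tokens[-1][0].append(unit)
    return tokens


def majority_vote(predictions):
    """Combine label sequences from several models by per-position majority vote."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]


# Hypothetical example: placeholder units "a", "b", "c" stand in for the
# basic units (for instance syllables) of a real sentence.
gold = [(["a", "b"], "n"), (["c"], "v")]
units, labels = encode_joint(gold)

# Three hypothetical model outputs; the second one disagrees at position 1.
runs = [labels, ["B-n", "B-n", "B-v"], labels]
voted = majority_vote(runs)
restored = decode_joint(units, voted)
```

Here a sequence labeler (a CRF or an LSTM-based RNN, for example) would predict one joint label per unit, so tokenization and tagging are decided together, and voting across several independently trained runs smooths out the disagreement of any single run.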

Journal: ACM Transactions on Asian and Low-Resource Language Information Processing
Publication Date: May 31, 2019
Citations: 23
License type: CC BY-NC-SA
