Abstract

In a Myanmar-to-English translation system, producing a meaningful sentence in the target language is a non-trivial task. Part-of-speech (POS) tagging, the process of assigning the correct syntactic category to each word, is used as an early stage of linguistic text analysis in many applications. Tagsets and word disambiguation rules are fundamental parts of any POS tagger. This paper presents a new approach to POS tagging for the Myanmar language. First, the user inputs a simple Myanmar sentence, which is segmented into words using segmentation rules. These words are then assigned to the appropriate syntactic categories of the Myanmar language using a combination of rule-based and probabilistic approaches. The system applies the conditional random field (CRF) method to resolve POS ambiguities on words. CRF is a framework for building discriminative probabilistic models for segmenting and labeling sequential data. The tagset for Myanmar POS, the segmentation rules, the tagging algorithm, and the CRF method are designed in this work. The proposed approach uses the UCSM Lexicon. This hybrid approach to POS tagging is intended to improve the accuracy and robustness of the machine translation system.
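
The hybrid idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the lexicon entries, the tag names, the placeholder tokens `w1` to `w3`, and the numeric weights are all hypothetical, and the weights merely stand in for trained CRF parameters. A lexicon lookup restricts each word to its candidate tags (the rule-based step), and Viterbi decoding over linear-chain scores picks the best tag sequence when a word is ambiguous (the probabilistic step).

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: each already-segmented word maps to its candidate tags.
LEXICON: Dict[str, List[str]] = {
    "w1": ["NOUN"],
    "w2": ["NOUN", "VERB"],  # ambiguous word
    "w3": ["PART"],
}

# Hypothetical weights standing in for trained CRF feature weights.
EMISSION: Dict[Tuple[str, str], float] = {
    ("w2", "NOUN"): 0.2,
    ("w2", "VERB"): 0.5,
}
TRANSITION: Dict[Tuple[str, str], float] = {
    ("NOUN", "NOUN"): 0.1,
    ("NOUN", "VERB"): 1.0,
    ("NOUN", "PART"): 0.3,
    ("VERB", "PART"): 0.8,
}


def tag(words: List[str]) -> List[str]:
    """Return the highest-scoring tag sequence for a segmented sentence."""
    if not words:
        return []
    # Rule-based step: the lexicon limits the tags each word may take.
    candidates = [LEXICON.get(w, ["UNK"]) for w in words]
    # Viterbi step: best[t] holds (score, path) of the best sequence ending in tag t.
    best = {t: (EMISSION.get((words[0], t), 0.0), [t]) for t in candidates[0]}
    for i in range(1, len(words)):
        new_best = {}
        for t in candidates[i]:
            score, path = max(
                ((s + TRANSITION.get((prev, t), 0.0), p) for prev, (s, p) in best.items()),
                key=lambda item: item[0],
            )
            new_best[t] = (score + EMISSION.get((words[i], t), 0.0), path + [t])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


print(tag(["w1", "w2", "w3"]))
```

With these made-up weights, the ambiguous token `w2` is resolved by its context rather than by the lexicon alone. In a real system the weights would be learned from annotated data.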
