Abstract

Myanmar, also known as Burmese, is a free word order language: the syntactic role of a word can vary with its position and the structure of the sentence. As a result, a single word in a Myanmar sentence often admits several ambiguous part-of-speech (POS) tags. This work presents a method for disambiguating the POS of words in written Myanmar text, with the aim of removing this ambiguity and assigning a single POS tag to each word in a sentence. The method has two steps: (i) the input sentence is segmented into words using syllable segmentation rules and a forward maximum matching approach with a monolingual Myanmar dictionary, and (ii) Joint Entropy (JE) is applied to each POS-ambiguous word in the sentence using a monolingual Myanmar tagged corpus. The joint probability value can provide useful and accurate POS disambiguation despite the free order and structure of words in Myanmar text. The monolingual Myanmar tagged corpus and tagged dictionary were created and contain 620 sentences and 15,000 words, respectively. The study aims at a practical word segmentation and POS tagging system that can help overcome a bottleneck in machine translation from Myanmar to other languages and in related natural language processing (NLP) research.
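
The two steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the JE formula, so the second step here simply picks the candidate tag with the highest joint relative frequency of (word, tag) in a tagged corpus, as a simplified stand-in. The function names, the `max_len` parameter, the tag labels, and the placeholder syllables ("a", "b", "c", "d") are all hypothetical, and the syllable segmentation rules are assumed to have been applied already.

```python
from collections import Counter


def forward_maximum_match(syllables, dictionary, max_len=4):
    """Greedily group syllables into the longest dictionary word at each position."""
    words = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first and shrink until a dictionary hit.
        for size in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = "".join(syllables[i:i + size])
            # A single syllable is always accepted as a fallback.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


def joint_probabilities(tagged_corpus):
    """Estimate P(word, tag) as a relative frequency over all tagged tokens."""
    counts = Counter(pair for sentence in tagged_corpus for pair in sentence)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def disambiguate(words, candidate_tags, joint_prob):
    """Assign each word the candidate tag with the highest joint probability."""
    result = []
    for word in words:
        tags = candidate_tags.get(word, ["UNK"])  # "UNK" is a hypothetical label
        best = max(tags, key=lambda t: joint_prob.get((word, t), 0.0))
        result.append((word, best))
    return result


# Toy data (assumed for illustration only; not from the paper's corpus).
syllables = ["a", "b", "c", "d"]
dictionary = {"ab", "abc", "d"}
tagged_corpus = [
    [("abc", "NOUN"), ("d", "VERB")],
    [("abc", "NOUN"), ("d", "PART")],
    [("abc", "VERB"), ("d", "PART")],
]
candidate_tags = {"abc": ["NOUN", "VERB"], "d": ["VERB", "PART"]}

words = forward_maximum_match(syllables, dictionary)
tagged = disambiguate(words, candidate_tags, joint_probabilities(tagged_corpus))
print(words)
print(tagged)
```

In this toy run the segmenter prefers the three-syllable entry "abc" over the shorter "ab" because longer matches are tried first, and each ambiguous word then receives the tag it co-occurs with most often in the toy corpus. A context-sensitive score, which the free word order of Myanmar would likely call for, would replace the per-word lookup in `disambiguate`.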
