Abstract

A lexicon is a collection of the individual words of a language and is essential for natural language processing (NLP) research such as machine translation, word segmentation, and speech processing. To support computerized systems that handle the Isarn Dharma Alphabets, this research collects the features needed for work in natural language and speech processing. In this study, an Isarn Dharma Alphabets lexicon was constructed using a trie structure. The lexicon comprises Isarn Dharma Alphabets words, Thai words, English words, phonemes, parts of speech, sub-parts of speech, special characteristics, Thai descriptions, and English descriptions. It contains approximately 8,000 words. In addition, a transcription system for the Isarn Dharma Alphabets, based on linguistic rules, is proposed.
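
The abstract does not describe how the trie is laid out, so the following is only a minimal sketch of how a trie-backed lexicon of this kind might be organized. The class names (`Entry`, `TrieNode`, `LexiconTrie`), the reduced set of fields, and the sample keys and values are all hypothetical placeholders, not data or code from the study.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Entry:
    # A subset of the fields named in the abstract; the real schema is an assumption.
    thai: str
    english: str
    phonemes: str
    pos: str


@dataclass
class TrieNode:
    children: Dict[str, "TrieNode"] = field(default_factory=dict)
    entry: Optional[Entry] = None  # set only on nodes that end a headword


class LexiconTrie:
    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, word: str, entry: Entry) -> None:
        """Store an entry under a headword, one node per character."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.entry = entry

    def lookup(self, word: str) -> Optional[Entry]:
        """Return the entry for an exact headword, or None if absent."""
        node = self._walk(word)
        return node.entry if node is not None else None

    def with_prefix(self, prefix: str) -> List[str]:
        """Return all stored headwords that start with the prefix, sorted."""
        start = self._walk(prefix)
        if start is None:
            return []
        found: List[str] = []
        stack = [(prefix, start)]
        while stack:
            text, node = stack.pop()
            if node.entry is not None:
                found.append(text)
            for ch, child in node.children.items():
                stack.append((text + ch, child))
        return sorted(found)

    def _walk(self, text: str) -> Optional[TrieNode]:
        """Follow the characters of text from the root; None if the path breaks."""
        node = self.root
        for ch in text:
            nxt = node.children.get(ch)
            if nxt is None:
                return None
            node = nxt
        return node


# Placeholder keys and values, for illustration only (not real lexicon data).
lexicon = LexiconTrie()
lexicon.insert("aba", Entry("thai-1", "english-1", "phonemes-1", "noun"))
lexicon.insert("abb", Entry("thai-2", "english-2", "phonemes-2", "verb"))
print(lexicon.lookup("aba"))
print(lexicon.with_prefix("ab"))
```

Because words that share a prefix also share nodes, a structure like this supports the longest-match lookups that word segmentation typically needs, which may be one reason a trie suits such a lexicon.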

