Abstract

Part-of-speech (POS) tagging is a fundamental step in many speech and text processing applications. It assigns each word in an input sentence a category according to its contextual and grammatical properties. Beyond the general difficulties of POS tagging, such as disambiguating multi-category words and handling unknown words, Persian, unlike English, is a free word order language with characteristics of its own. These challenges can greatly affect the quality of the tagging process. Efficient POS taggers have been developed for some languages, especially English, but little research has addressed Persian. To address these issues and achieve high POS tagging accuracy, we select features that capture the important characteristics of words in a sentence and use maximum entropy as the machine learning classifier. Experimental results show that the proposed Persian POS tagging system outperforms other state-of-the-art Persian taggers.
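As a rough illustration of the general technique named above, the following is a minimal sketch of a maximum entropy (multinomial logistic) tagger over contextual word features. It is not the system described in this paper: the feature set (`extract_features`), the toy English corpus (`TRAIN`), the tag names, and the plain stochastic gradient training loop are all assumptions chosen only to keep the example short and self-contained.

```python
import math
from collections import defaultdict

# Hypothetical toy corpus; a real tagger would train on an annotated Persian corpus.
TRAIN = [
    [("the", "DET"), ("dog", "N"), ("runs", "V")],
    [("a", "DET"), ("cat", "N"), ("sleeps", "V")],
    [("the", "DET"), ("cat", "N"), ("runs", "V")],
]

def extract_features(words, i):
    """Binary features for the word at position i (illustrative choices only)."""
    word = words[i]
    return [
        "bias",
        "word=" + word,
        "suffix2=" + word[-2:],
        "prev=" + (words[i - 1] if i > 0 else "<s>"),
        "next=" + (words[i + 1] if i + 1 < len(words) else "</s>"),
    ]

def predict_proba(weights, tags, feats):
    """Maximum entropy model: p(tag | feats) is a softmax over summed feature weights."""
    scores = {t: sum(weights[(f, t)] for f in feats) for t in tags}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {t: math.exp(s - top) for t, s in scores.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

def train(tagged_sentences, epochs=50, lr=0.5):
    """Fit weights by stochastic gradient ascent on the conditional log-likelihood."""
    tags = sorted({t for sent in tagged_sentences for _, t in sent})
    weights = defaultdict(float)
    for _ in range(epochs):
        for sent in tagged_sentences:
            words = [w for w, _ in sent]
            for i, (_, gold) in enumerate(sent):
                feats = extract_features(words, i)
                probs = predict_proba(weights, tags, feats)
                for t in tags:
                    # Gradient for active features: observed count minus expected count.
                    grad = (1.0 if t == gold else 0.0) - probs[t]
                    for f in feats:
                        weights[(f, t)] += lr * grad
    return weights, tags

def tag(weights, tags, words):
    """Assign each word its most probable tag, one position at a time."""
    result = []
    for i in range(len(words)):
        probs = predict_proba(weights, tags, extract_features(words, i))
        result.append(max(tags, key=lambda t: probs[t]))
    return result

weights, tags = train(TRAIN)
print(tag(weights, tags, ["the", "dog", "sleeps"]))
```

The sketch tags each position independently and omits regularization; both are simplifications made for brevity, not claims about the proposed system.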
