PTokenizer: POS tagger Tokenizer

Saeed Rahmani Seyyed, Mohammad Hadi Sadredini, Mostafa Fakhrahmad

doi:10.1109/kbei.2015.7436056

Publication Date: Nov 1, 2015

Citations: 2

Affiliation: Shiraz University

#Part Of Speech Taggers #Part Of Speech

Abstract

With the advent of new information sources and the expansion of text data, natural language processing (NLP) has become a key part of all systems dealing with human-written text, and part-of-speech (POS) tagging is an inseparable part of all NLP tasks. As a result, it is of paramount importance to enhance the accuracy of POS tagging. In this paper, applying a language model and statistical information, we introduce a new approach to tokenizing sentences and preparing them to be labeled by POS taggers. An evaluation shows that the proposed method yields a precision of 98 percent for tokenizing, and that applying it to Maximum Likelihood and TnT POS taggers improves the accuracy of Persian POS tagging.
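
To illustrate why token boundaries matter to a tagger, the following is a minimal sketch of a most-frequent-tag tagger, which is the usual reading of "Maximum Likelihood tagger"; whether the paper uses exactly this formulation is an assumption. The function names, the fallback to the overall most frequent tag for unseen tokens, and the toy English data are hypothetical and are not taken from the paper. It does not implement the proposed tokenizer.

```python
from collections import Counter, defaultdict


def train_ml_tagger(tagged_sentences):
    """Learn, for each word, the tag it carries most often in the training data."""
    tag_counts_per_word = defaultdict(Counter)
    overall_tag_counts = Counter()
    for sentence in tagged_sentences:
        for word, tag in sentence:
            tag_counts_per_word[word][tag] += 1
            overall_tag_counts[tag] += 1
    best_tag = {
        word: counts.most_common(1)[0][0]
        for word, counts in tag_counts_per_word.items()
    }
    # Assumption: unseen tokens fall back to the overall most frequent tag.
    default_tag = overall_tag_counts.most_common(1)[0][0]
    return best_tag, default_tag


def tag_tokens(tokens, model):
    """Tag each token by dictionary lookup; unseen tokens get the default tag."""
    best_tag, default_tag = model
    return [(token, best_tag.get(token, default_tag)) for token in tokens]


# Hypothetical toy corpus (English, for readability only).
train = [
    [("the", "DET"), ("book", "N"), ("is", "V"), ("new", "ADJ")],
    [("the", "DET"), ("new", "ADJ"), ("book", "N"), ("store", "N")],
    [("book", "V"), ("the", "DET"), ("room", "N")],
]
model = train_ml_tagger(train)
result = tag_tokens(["the", "book", "arrived"], model)
print(result)
```

Because such a tagger looks each token up as a whole unit, a tokenizer that splits or merges words differently from the training data turns known words into unseen ones, and those can only receive the fallback tag.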

Full Text