Abstract

Background: Part-of-Speech (POS) tagging is the process of assigning the appropriate part of speech to each word in a given context, for example deciding whether a word is a verb, a noun, or a particle. POS tagging is an important preprocessing step in many Natural Language Processing (NLP) applications such as question answering, text summarization, and information retrieval. Objective: The performance of NLP applications depends on the accuracy of POS taggers, since the downstream application can only work properly if the words in a sentence are assigned the right tags. Many approaches have been proposed for the Arabic language, but more investigation is needed to improve the efficiency of Arabic POS taggers. Method: In this study, we propose a supervised POS tagging system for the Arabic language that combines Particle Swarm Optimization (PSO), Genetic Algorithms (GA), and a Hidden Markov Model (HMM). The tagging process is treated as an optimization problem and modeled as a swarm consisting of a group of particles, where each particle represents a sequence of tags. The PSO algorithm is applied to find the best sequence of tags, that is, the sequence that represents the correct tags of the sentence. The genetic operators, crossover and mutation, are used to determine the personal best, global best, and velocity in the PSO algorithm. The HMM is used to compute the fitness of the particles in the swarm. Results: The performance of the proposed approach is evaluated on the KALIMAT dataset, which consists of 18 million words, with a tag set of 45 tags that covers all Arabic POS tags. The proposed tagger achieved an accuracy of 90.5%. Conclusion: Experimental results show that the proposed tagger achieves promising results compared to four existing approaches. Other approaches can identify only three tags: noun, verb, and particle. In addition, the accuracy for some tags exceeds that achieved by other approaches.
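As a rough illustration of the idea in the Method paragraph, the sketch below scores candidate tag sequences (particles) with an HMM log-probability and picks the fittest one as the global best. It is not the authors' implementation: the three-tag set, the transliterated example words, every probability value, the smoothing floor, and the names `hmm_fitness` and `global_best` are hypothetical, and the crossover and mutation updates are omitted because the abstract does not specify them.

```python
import math

# Hypothetical toy HMM parameters, chosen only for illustration.
START = {"NOUN": 0.6, "VERB": 0.3, "PART": 0.1}
TRANS = {
    "NOUN": {"NOUN": 0.3, "VERB": 0.5, "PART": 0.2},
    "VERB": {"NOUN": 0.6, "VERB": 0.1, "PART": 0.3},
    "PART": {"NOUN": 0.7, "VERB": 0.2, "PART": 0.1},
}
EMIT = {
    "NOUN": {"alwalad": 0.4, "alkitab": 0.4},
    "VERB": {"qaraa": 0.8},
    "PART": {"fi": 0.9},
}
FLOOR = 1e-6  # assumed smoothing value for unseen (tag, word) pairs


def hmm_fitness(words, tags):
    """Log joint probability of a tag sequence and the words under the toy HMM."""
    score = math.log(START[tags[0]])
    score += math.log(EMIT[tags[0]].get(words[0], FLOOR))
    for i in range(1, len(words)):
        score += math.log(TRANS[tags[i - 1]][tags[i]])
        score += math.log(EMIT[tags[i]].get(words[i], FLOOR))
    return score


def global_best(words, swarm):
    """Return the particle (tag sequence) with the highest HMM fitness."""
    return max(swarm, key=lambda tags: hmm_fitness(words, tags))


# Transliterated example sentence: "read the-boy the-book".
words = ["qaraa", "alwalad", "alkitab"]
swarm = [
    ["VERB", "NOUN", "NOUN"],
    ["NOUN", "NOUN", "VERB"],
    ["PART", "VERB", "NOUN"],
]
best = global_best(words, swarm)
print(best, hmm_fitness(words, best))
```

Working in log space avoids numerical underflow on long sentences; in a full tagger the swarm would be updated iteratively rather than scored once.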
