Abstract

Data mining has grown in importance with the popularity of social networks and the emergence of abbreviated words, foreign terms and emoticons in the Persian language. Numerous studies have been conducted to identify the type of each word in a sentence. However, identifying the role that each word plays in a sentence is far more important than identifying its type. In addition, the orthographic and grammatical similarity of Persian to Arabic means that the method proposed in this paper can also be applied to Arabic. In this paper, we adopt the Hidden Markov Model (HMM) with Tri-gram tagging to identify the morphology of composition roles in Persian sentences. We then compare the proposed technique with the Hidden Markov Model using Uni-gram and Bi-gram tagging. The proposed method identifies word roles in terms of "independent" and "dependent" roles and takes into account several factors that contribute to the role of a word in a sentence. The simulation results show that, for independent composition roles, the average success rate of HMM with Tri-gram tagging was 20.56% and 17.67% higher than that of the Uni-gram and Bi-gram methods, respectively. For dependent composition roles, the improvements were 24.67% and 32.62%, respectively.
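As an illustration of the general technique named above, the following is a minimal sketch of a Tri-gram HMM tagger decoded with the Viterbi algorithm. It is not the implementation evaluated in the paper. The toy corpus, the transliterated words, the role labels (SUBJ, OBJ, VERB), the add-one smoothing and all function names are assumptions made only for this example.

```python
from collections import defaultdict
import math

# Hypothetical toy corpus: each sentence is a list of (word, role) pairs.
# Words and role labels are illustrative, not taken from the paper's data.
TRAIN = [
    [("ali", "SUBJ"), ("ketab", "OBJ"), ("kharid", "VERB")],
    [("sara", "SUBJ"), ("nameh", "OBJ"), ("nevesht", "VERB")],
    [("ali", "SUBJ"), ("nameh", "OBJ"), ("kharid", "VERB")],
]
START, STOP = "<s>", "</s>"


def train(sentences):
    """Count tri-gram role transitions and role-to-word emissions."""
    trans = defaultdict(lambda: defaultdict(int))  # (r1, r2) -> r3 -> count
    emit = defaultdict(lambda: defaultdict(int))   # role -> word -> count
    vocab = set()
    for sent in sentences:
        roles = [START, START] + [r for _, r in sent] + [STOP]
        for i in range(2, len(roles)):
            trans[(roles[i - 2], roles[i - 1])][roles[i]] += 1
        for word, role in sent:
            emit[role][word] += 1
            vocab.add(word)
    return trans, emit, vocab


def log_prob(counts, key, n_outcomes):
    """Add-one smoothed log-probability (smoothing choice is an assumption)."""
    total = sum(counts.values())
    return math.log((counts.get(key, 0) + 1) / (total + n_outcomes))


def viterbi(words, trans, emit, vocab):
    """Return the most probable role sequence under the tri-gram HMM."""
    roles = sorted(emit)
    n_trans = len(roles) + 1  # every role plus STOP
    n_emit = len(vocab) + 1   # known words plus one unknown-word slot
    # best[(u, v)] = (log-probability, path) of the best path ending in u, v
    best = {(START, START): (0.0, [])}
    for word in words:
        new = {}
        for (u, v), (score, path) in best.items():
            for r in roles:
                s = (score
                     + log_prob(trans[(u, v)], r, n_trans)
                     + log_prob(emit[r], word, n_emit))
                if (v, r) not in new or s > new[(v, r)][0]:
                    new[(v, r)] = (s, path + [r])
        best = new
    # Close each candidate path with the transition into STOP.
    _, (_, path) = max(
        best.items(),
        key=lambda kv: kv[1][0] + log_prob(trans[kv[0]], STOP, n_trans),
    )
    return path


trans, emit, vocab = train(TRAIN)
print(viterbi(["sara", "ketab", "kharid"], trans, emit, vocab))
```

In this sketch each role is conditioned on the two preceding roles. A Bi-gram or Uni-gram variant would condition on one preceding role or on none, which is the kind of baseline the comparison above refers to.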
