Abstract
Identifying the syntactic category of each word, known as part-of-speech (POS) tagging, is among the most fundamental operations in natural language processing. POS taggers are important tools because they are widely used in other natural language processing tasks. The first problem in designing a Persian POS tagging system is extracting probabilistic and statistical information, as well as practical features, from Persian text corpora. In this study, the extracted probabilistic and statistical features were used to train and test a long short-term memory (LSTM) neural network. To increase accuracy, the length of the dependency on previous words was increased and bidirectional word assessment was applied in the LSTM network. The results were compared with those of other methods; the best accuracy was obtained with the two-word dependency and the bidirectional LSTM network. An accuracy of 98.1% was obtained using the previous 5 words. Compared with state-of-the-art results, the proposed scheme achieves better accuracy in all POS tagging conditions.
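As a rough illustration of what a dependency on the previous k words can look like as input data, the sketch below pairs each token with its k preceding tokens. It is not the method of this study: the function name `previous_word_windows`, the `<PAD>` token, and the transliterated example sentence are all assumptions made for illustration, and the feature extraction and the LSTM itself are left out.

```python
from typing import List, Tuple

# Hypothetical padding token for positions before the sentence start
# (an assumption for this sketch, not taken from the study).
PAD = "<PAD>"


def previous_word_windows(tokens: List[str], k: int) -> List[Tuple[List[str], str]]:
    """Pair each token with the k tokens that precede it, left-padded with PAD."""
    if k < 0:
        raise ValueError("k must be non-negative")
    padded = [PAD] * k + tokens
    # For token i, padded[i:i + k] holds its k predecessors and padded[i + k] is the token.
    return [(padded[i:i + k], padded[i + k]) for i in range(len(tokens))]


# Transliterated Persian tokens, used only as example input.
windows = previous_word_windows(["man", "ketab", "ra", "khandam"], k=2)
for context, word in windows:
    print(context, "->", word)
```

Setting `k=5` would give the five-word context mentioned above; a bidirectional model would additionally read the sentence from right to left, which this sketch does not cover.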