This article introduces a part-of-speech (POS) tagging system developed for the Eastern Armenian language. Its primary objective is to compare two well-established approaches to POS tagging: a hidden Markov model (HMM) decoded with the Viterbi algorithm, and a recurrent neural network (RNN). All experiments and evaluations use ArmSpeech-POS, an Eastern Armenian part-of-speech tagged corpus. POS tagging is a fundamental task in natural language processing (NLP) that assigns a grammatical tag to each word in a text, and accurate tagging is crucial for applications such as machine translation, information retrieval, and sentiment analysis. The Viterbi algorithm is a well-established dynamic-programming method that finds the most likely sequence of POS tags under an HMM. RNNs, a type of deep learning model, can capture complex patterns and dependencies in sequential data. Experimental results indicate that both methods achieve reasonable accuracy on Eastern Armenian, with the RNN achieving higher accuracy than the HMM with Viterbi decoding. This can be attributed to the RNN’s ability to capture long-range dependencies and learn intricate linguistic patterns. The article concludes by discussing the implications of these findings and directions for further research: it emphasizes the importance of accurate POS tagging for improving NLP applications in Eastern Armenian, and it suggests exploring more advanced neural network architectures and incorporating linguistic features to improve tagging performance.
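To make the HMM baseline concrete, the following is a minimal sketch of Viterbi decoding for a first-order HMM tagger. It is not the article's implementation: the function name `viterbi`, the two-tag tagset, the English toy tokens, and every probability in the tables are hypothetical values chosen for illustration, and unseen words are handled with a simple probability floor rather than any smoothing scheme the authors may have used.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most likely tag sequence for `words` under a first-order HMM.

    Probabilities are combined in log space to avoid underflow; zero or
    missing probabilities are replaced by `floor` (an assumption of this sketch).
    """
    if not words:
        return []

    def lp(p):
        return math.log(max(p, floor))

    # best[i][t]: log-probability of the best tag path ending in tag t at position i
    best = [{t: lp(start_p.get(t, 0.0)) + lp(emit_p[t].get(words[0], 0.0))
             for t in tags}]
    # back[i][t]: the previous tag on that best path
    back = [{}]

    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            # Choose the predecessor tag that maximizes the path score into t.
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p[p].get(t, 0.0))) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p[t].get(words[i], 0.0))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: two tags, two tokens, made-up probabilities.
TAGS = ["NOUN", "VERB"]
START_P = {"NOUN": 0.8, "VERB": 0.2}
TRANS_P = {
    "NOUN": {"NOUN": 0.3, "VERB": 0.7},
    "VERB": {"NOUN": 0.6, "VERB": 0.4},
}
EMIT_P = {
    "NOUN": {"dog": 0.9, "runs": 0.1},
    "VERB": {"dog": 0.1, "runs": 0.9},
}

print(viterbi(["dog", "runs"], TAGS, START_P, TRANS_P, EMIT_P))
# → ['NOUN', 'VERB']
```

In a real tagger the start, transition, and emission tables would be estimated from tag and word counts in an annotated corpus such as ArmSpeech-POS. An RNN tagger replaces these fixed tables with learned hidden states, which is one way to read the abstract's remark about long-range dependencies.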