Part of Speech Tagging for Tamil Language Using Deep Learning

Hemakasiny Visuwalingam, Roshan G Ragel, Ratnasingam Sakuntharaj

doi:10.1109/iciis53135.2021.9660738

Abstract

Part of Speech (POS) tagging is the process of assigning each word in a sentence its corresponding part of speech. POS tagging is a common pre-processing step in Natural Language Processing (NLP) applications such as speech recognition, machine translation and sentiment analysis. A few studies have addressed POS tagging for Tamil words. However, the performance of a POS tagger on unknown words (words that do not appear in the lexicon) has not been explored in the literature. Unknown words are a frequent problem in POS tagging because, in real-world use, an NLP application will encounter words that are not in its lexicon. This paper proposes a deep learning-based POS tagger for the Tamil language using Bi-directional Long Short Term Memory (BLSTM). Our experiments use two corpora: the AU-KBC annotated corpus and the MeitY corpus. We also analyse the performance of the POS tagger on unknown words. Test results show that the approach tags Tamil words with accuracies of 99.8%, 99.5% and 96.5% when the test sentences contain only known words, around 9.8% unknown words, and 47.6% unknown words, respectively.
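
The unknown-word analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the toy tokens and the tag labels are all hypothetical. It assumes the lexicon is simply the set of word forms seen in the training data, and that the tagger's predictions are already available as one list of tags per test sentence.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

# A sentence is a list of (word, gold_tag) pairs. This layout is an assumption
# made for illustration, not the format of the AU-KBC or MeitY corpora.
Sentence = List[Tuple[str, str]]


def build_lexicon(train: Iterable[Sentence]) -> Set[str]:
    """Collect every word form seen in training (assumed definition of the lexicon)."""
    return {word for sentence in train for word, _ in sentence}


def unknown_word_rate(test: List[Sentence], lexicon: Set[str]) -> float:
    """Fraction of test tokens whose word form is not in the lexicon."""
    words = [word for sentence in test for word, _ in sentence]
    if not words:
        return 0.0
    return sum(word not in lexicon for word in words) / len(words)


def accuracy_by_group(
    test: List[Sentence], predicted: List[List[str]], lexicon: Set[str]
) -> Dict[str, Optional[float]]:
    """Tagging accuracy computed separately for known and unknown words.

    A group with no tokens gets None instead of a number.
    """
    correct = {"known": 0, "unknown": 0}
    total = {"known": 0, "unknown": 0}
    for sentence, tags in zip(test, predicted):
        for (word, gold), guess in zip(sentence, tags):
            group = "known" if word in lexicon else "unknown"
            total[group] += 1
            correct[group] += int(gold == guess)
    return {g: (correct[g] / total[g] if total[g] else None) for g in total}


# Toy data with placeholder tokens and tags (hypothetical, not from the paper).
train_data = [[("w1", "N"), ("w2", "V")], [("w3", "N"), ("w1", "N")]]
test_data = [[("w1", "N"), ("w4", "V")], [("w2", "V"), ("w3", "N")]]
predictions = [["N", "N"], ["V", "N"]]  # stand-in for the tagger's output

lexicon = build_lexicon(train_data)
print(unknown_word_rate(test_data, lexicon))
print(accuracy_by_group(test_data, predictions, lexicon))
```

Reporting accuracy per group, alongside the overall unknown-word rate of the test set, makes it possible to tell how much of a tagger's score depends on words it has already seen.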
