Prediction of POS Tagging for Unknown Words for Specific Hindi and Marathi Language

Kirti Chiplunkar, Suresh Limkar, Tejaswini Chaudhari, Meghna Kharche, Saurabh Shaligram

doi:10.1007/978-981-15-5679-1_13

Publication Date: Aug 30, 2020

Citations: 2

Affiliation: International Institute of Information Technology, Tech Mahindra (India)

#Part Of Speech Tagging #Marathi Language

Abstract

Part of Speech (POS) tagging for Indian languages such as Hindi and Marathi remains largely unexplored. Some of the best taggers available for Indian languages use hybrids of machine learning or stochastic techniques and phonetic information. The available corpora for Hindi and Marathi are limited, so Natural Language Processing (NLP) applied to Hindi and Marathi sentences often does not achieve the desired results. Current POS tagging techniques assign the UNKNOWN (UNK) tag to words that are not present in the corpus. This paper proposes extending a Hidden Markov Model (HMM)-based approach to POS tagging with the Naive Bayes theorem in order to predict a POS tag for words that would otherwise be tagged UNK.
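The idea of predicting a tag for UNK words with Naive Bayes can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which features the Naive Bayes step uses, so the sketch assumes the tags of the neighbouring words as features, and it uses a simple most-frequent-tag lookup as a stand-in for the HMM tagger. The toy corpus (romanized, invented for illustration) and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Toy tagged corpus: romanized and invented purely for illustration.
TRAIN = [
    [("ram", "NOUN"), ("ghar", "NOUN"), ("jata", "VERB"), ("hai", "AUX")],
    [("sita", "NOUN"), ("kitab", "NOUN"), ("padhti", "VERB"), ("hai", "AUX")],
    [("ladka", "NOUN"), ("pani", "NOUN"), ("pita", "VERB"), ("hai", "AUX")],
]


def train(corpus):
    """Collect the counts needed by the lookup tagger and by Naive Bayes."""
    lexicon = defaultdict(Counter)     # word -> counts of its tags
    prior = Counter()                  # tag -> count
    prev_given = defaultdict(Counter)  # tag -> counts of the preceding tag
    next_given = defaultdict(Counter)  # tag -> counts of the following tag
    for sent in corpus:
        tags = ["<s>"] + [t for _, t in sent] + ["</s>"]
        for i, (word, tag) in enumerate(sent, start=1):
            lexicon[word][tag] += 1
            prior[tag] += 1
            prev_given[tag][tags[i - 1]] += 1
            next_given[tag][tags[i + 1]] += 1
    return lexicon, prior, prev_given, next_given


def baseline_tag(words, lexicon):
    """Stand-in for the HMM tagger: most frequent tag, or UNK if unseen."""
    return [
        lexicon[w].most_common(1)[0][0] if w in lexicon else "UNK"
        for w in words
    ]


def predict_unk(tags, i, prior, prev_given, next_given):
    """Naive Bayes: argmax over t of P(t) * P(prev | t) * P(next | t)."""
    context = ["<s>"] + tags + ["</s>"]
    prev_tag, next_tag = context[i], context[i + 2]
    total = sum(prior.values())
    vocab = len(prior) + 3  # known tags plus <s>, </s> and UNK
    best_tag, best_score = None, -math.inf
    for t, count in prior.items():
        # Add-one smoothing keeps unseen contexts from zeroing the product.
        score = (
            math.log(count / total)
            + math.log((prev_given[t][prev_tag] + 1) / (count + vocab))
            + math.log((next_given[t][next_tag] + 1) / (count + vocab))
        )
        if score > best_score:
            best_tag, best_score = t, score
    return best_tag


def tag_sentence(words, model):
    """Tag with the baseline, then replace each UNK by the NB prediction."""
    lexicon, prior, prev_given, next_given = model
    tags = baseline_tag(words, lexicon)
    return [
        predict_unk(tags, i, prior, prev_given, next_given) if t == "UNK" else t
        for i, t in enumerate(tags)
    ]


model = train(TRAIN)
print(tag_sentence("mohan ghar jata hai".split(), model))
print(tag_sentence("ram ghar dekhta hai".split(), model))
```

Here "mohan" and "dekhta" do not occur in the toy corpus, so the lookup tagger alone would label them UNK; the Naive Bayes step instead picks the tag that best fits the surrounding tags. With a real HMM tagger, the transition probabilities could play a similar role, and word-internal features such as suffixes might be added to the Naive Bayes model.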
