Unsupervised Approach to Word Sense Disambiguation in Malayalam

K.P Sruthi Sankar, P.C Reghu Raj, V Jayan

doi:10.1016/j.protcy.2016.05.106


Open Access

https://doi.org/10.1016/j.protcy.2016.05.106


Abstract

Word Sense Disambiguation (WSD) is the task of identifying the correct sense of a word in a specific context when the word has multiple meanings. WSD is an important intermediate step in many Natural Language Processing (NLP) tasks, especially Information Extraction (IE), Machine Translation (MT) and Question Answering systems. Word sense ambiguity arises when a particular word has more than one possible sense, and every language contains many ambiguous words. Since the sense of a word depends on its context of use, the disambiguation process requires an understanding of word knowledge. Automatic WSD systems are available for structured languages like English and Chinese, but Indian languages are morphologically rich, which makes the processing task very complex. The aim of this work is to develop a WSD system for Malayalam, a language spoken in India, predominantly in the state of Kerala. The proposed system uses a corpus collected from various Malayalam web documents. For each possible sense of the ambiguous word, a relatively small set of training examples (a seed set) that represents the sense is identified. Collocations and the most frequently co-occurring words are used as training examples. A seed set expansion module extends each seed set by adding the words most similar to its elements. These extended sets act as sense clusters. The sense cluster most similar to the context of the input text is taken as the sense of the target word.
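The pipeline described in the abstract (seed sets, seed set expansion, sense clusters, cluster-to-context matching) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the similarity measures, so the sketch assumes raw co-occurrence counts as the word similarity and word overlap as the cluster-to-context similarity. The function names, the `top_k` parameter, and the English toy data for the word "bank" are all hypothetical and chosen only for readability.

```python
from collections import Counter


def expand_seed_set(seeds, cooccurrence, top_k=2):
    """Extend a seed set with the top_k strongest co-occurring words of each seed.

    Assumption: co-occurrence counts stand in for the unspecified
    word-similarity measure.
    """
    expanded = set(seeds)
    for seed in seeds:
        neighbours = cooccurrence.get(seed, Counter())
        expanded.update(word for word, _ in neighbours.most_common(top_k))
    return expanded


def disambiguate(context_words, sense_clusters):
    """Return the sense whose cluster shares the most words with the context.

    Assumption: plain word overlap stands in for the unspecified
    cluster-to-context similarity.
    """
    context = set(context_words)
    scores = {
        sense: len(context & cluster)
        for sense, cluster in sense_clusters.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy data for the ambiguous English word "bank".
seed_sets = {"river": {"water"}, "finance": {"money"}}
cooccurrence = {
    "water": Counter({"shore": 5, "fish": 3, "boat": 1}),
    "money": Counter({"loan": 4, "deposit": 4, "cash": 2}),
}

# Each expanded seed set acts as one sense cluster.
clusters = {
    sense: expand_seed_set(seeds, cooccurrence)
    for sense, seeds in seed_sets.items()
}

sentence = ["he", "took", "a", "loan", "from", "the", "bank"]
print(disambiguate(sentence, clusters))
```

A real system for Malayalam would also need morphological analysis before matching, since inflected forms would otherwise fail to overlap with cluster entries. That step is omitted here.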


Journal: Procedia Technology	Publication Date: Jan 1, 2016
Citations: 11	License type: cc-by-nc-nd


Similar Papers

Word vs. Class-Based Word Sense Disambiguation
Ruben Izquierdo ... German Rigau
Journal of Artificial Intelligence Research | VOL. 54
09 Sep 2015

An approach to reduce part of speech ambiguity using semantically annotated lexicon definitions
Andrei Minca ... Stefan Diaconescu
01 Sep 2012

An Approach to Reduce Part of Speech Ambiguity Using Semantically Annotated Lexicon Definitions
Andrei Minca ... Stefan Diaconescu
01 Jan 2013

A Survey of Different Approaches for Word Sense Disambiguation
Rasika Ransing ... Archana Gulati
06 Nov 2022
