Abstract
Machine Translation, Information Retrieval and Knowledge Acquisition are the three main applications of Word Sense Disambiguation (WSD). The sense of a target word can be identified from a dictionary using a ‘bag of words’, i.e. the neighbours of the target word. A target word is a word whose spelling stays the same while its meaning changes with context, e.g. chair or light. In WSD, the key inputs are a sentence and a target word. However, the target word should be detected automatically rather than supplied. If a sentence has more than one candidate target word, further filtering is required. In this study, a framework consisting of buzz words and query words has been developed to detect target words using the WordNet dictionary. Buzz words are defined as a ‘bag of words’ selected using POS tags, and query words are words that have multiple meanings. The framework then attempts to find the sense of the detected target word by matching buzz words against the word's gloss and examples. This is a semi-supervised approach because 266 words with multiple meanings were labelled from various sources and then used within an unsupervised procedure to detect the target word and its sense (meaning). In an experiment on a dataset of 300 hotel reviews, 100 % of the target words in the sentences were detected, with 84 % related to the sense of the sentence or phrase.
Highlights
Choosing the correct sense of a word in context is the task of Word Sense Disambiguation (WSD), which matters because most words have multiple meanings; e.g. the word “run” has 179 meanings, while the word “take” has 127 different definitions [1]
We propose a method that filters a target word out of multiple ambiguous words using buzz words and query words together with a lexicon, the multiple meaning words list (MMWL), and that generates the correct sense of the target word by matching buzz words against the gloss and examples of the target word in the WordNet lexicon
The evaluation strategy of Word Sense Disambiguation (WSD) is based on whether the sense selected for an ambiguous word in a given context is correct according to human judgment
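One plausible reading of this evaluation strategy is a simple accuracy score: the share of ambiguous words whose selected sense matches the sense chosen by a human annotator. The sketch below illustrates that reading only; the function name `sense_accuracy` and the sense labels are hypothetical and are not taken from the study.

```python
def sense_accuracy(predicted, human_labels):
    """Fraction of predicted senses that match the human-judged senses.

    Both arguments are equal-length sequences of sense labels, one per
    ambiguous word occurrence. The labels themselves are hypothetical.
    """
    if len(predicted) != len(human_labels):
        raise ValueError("predicted and human_labels must have the same length")
    if not predicted:
        return 0.0  # nothing to evaluate
    correct = sum(1 for p, h in zip(predicted, human_labels) if p == h)
    return correct / len(predicted)


# Illustrative labels only: three of the four predictions agree with the annotator.
score = sense_accuracy(
    ["illumination", "furniture", "not_heavy", "chairperson"],
    ["illumination", "furniture", "illumination", "chairperson"],
)
```

Here `score` is 0.75, since three of the four selected senses agree with the human labels.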
Summary
Choosing the correct sense of a word in context is the task of Word Sense Disambiguation (WSD), which matters because most words have multiple meanings; e.g. the word “run” has 179 meanings, while the word “take” has 127 different definitions [1]. Knowledge-based WSD systems exploit the information in a lexical knowledge base, such as WordNet or Wikipedia, to perform WSD. These approaches usually choose the sense whose definition is most similar to the context of the ambiguous word, using textual overlap or graph-based measures [4]. Other approaches, called corpus-based approaches, do not make use of any knowledge resources for disambiguation. These approaches range from supervised learning [5], in which a classifier is trained for each distinct word on a corpus of manually sense-annotated examples, to entirely unsupervised methods that cluster the occurrences of words, thereby inducing senses. The contributions of this study are as follows: (1) a method to filter a target word (the word whose sense is required) from multiple ambiguous words using buzz words and query words together with a lexicon, the multiple meaning words list (MMWL); and (2) generation of the correct sense of the target word by matching buzz words against the gloss and examples of the target word in the WordNet lexicon
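The two contributions can be illustrated with a minimal, self-contained sketch. It is not the authors' implementation. It assumes a tiny hand-written sense inventory (`TOY_SENSES`) standing in for both the MMWL and the WordNet glosses and examples, and it replaces the POS-tag-based selection of buzz words with a plain stop-word filter. Sense selection is a simplified gloss-overlap (Lesk-style) count. All names and entries below are hypothetical.

```python
import re

# Hypothetical miniature stand-in for the MMWL and for WordNet glosses and
# examples. Each sense maps to its gloss followed by one example sentence.
TOY_SENSES = {
    "light": {
        "illumination": "visible radiation that makes things bright; "
                        "the lamp gave a soft light in the room",
        "not_heavy": "of little weight; the bag was light enough to carry",
    },
    "chair": {
        "furniture": "a seat for one person with a back; "
                     "he sat on a chair in the room",
        "chairperson": "the officer who presides at a meeting; "
                       "the chair opened the meeting",
    },
}

# Assumption: a stop-word filter approximates the POS-tag-based buzz words.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "was", "is", "and",
              "for", "with", "that", "who", "at", "he", "it"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def detect_targets(sentence):
    """Return the words of the sentence that appear in the toy MMWL."""
    return [w for w in tokenize(sentence) if w in TOY_SENSES]


def buzz_words(sentence, target):
    """Context words of the sentence, excluding the target and stop words."""
    return {w for w in tokenize(sentence)
            if w != target and w not in STOP_WORDS}


def pick_sense(sentence, target):
    """Pick the sense whose gloss and example share most words with the context."""
    context = buzz_words(sentence, target)
    best_sense, best_overlap = None, -1
    for sense, gloss in TOY_SENSES[target].items():
        overlap = len(context & set(tokenize(gloss)))
        if overlap > best_overlap:  # ties keep the earlier sense
            best_sense, best_overlap = sense, overlap
    return best_sense, best_overlap


sentence = "The room was bright because the lamp gave plenty of light."
targets = detect_targets(sentence)
result = pick_sense(sentence, targets[0])
```

For this sentence the sketch detects `light` as the only target word and selects the `illumination` sense, because the context words "room", "bright", "lamp" and "gave" all occur in that sense's gloss and example. A fuller version would presumably draw glosses and examples from WordNet itself and use a POS tagger to choose buzz words, as the study describes.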