Word Sense Disambiguation Using Prior Probability Estimation Based on the Korean WordNet

Minho Kim, Hyuk-Chul Kwon

doi:10.3390/electronics10232938

Open Access

https://doi.org/10.3390/electronics10232938

Abstract

Supervised disambiguation trained on a large amount of corpus data outperforms other word sense disambiguation methods. However, large-scale sense-tagged corpora are costly and time-consuming to construct. Unsupervised disambiguation, by contrast, is relatively easy to implement, but most attempts have not performed satisfactorily. A primary reason for this weaker performance is that the semantic occurrence probability of ambiguous words is not available, so a data deficiency problem arises when determining the dependency between words. This paper proposes an unsupervised disambiguation method that uses prior probability estimation based on the Korean WordNet and performs better than supervised disambiguation. In the Korean WordNet, every word shares semantic characteristics with its related words. It is therefore assumed that the dependency between two words is the same as the dependency between their related words. The data deficiency problem is resolved by calculating the χ² statistic between related words to determine the dependency between words. Moreover, to obtain the same effect as the semantic occurrence probability that supervised disambiguation uses as a prior probability, semantically related words of each ambiguous word are collected and used as prior probability data. Experiments were conducted on Korean, English, and Chinese to evaluate the proposed lexical disambiguation method. The proposed method outperformed supervised disambiguation methods even though it is unsupervised (using a knowledge-based approach).
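As a rough illustration of the dependency measure mentioned in the abstract, the sketch below computes Pearson's χ² statistic for a 2×2 co-occurrence table. The function name, the argument names, and the counts are hypothetical. The abstract does not say how the counts are collected or how the statistic is thresholded, so this is only the textbook formula, not the authors' implementation.

```python
def chi_square_2x2(o11: int, o12: int, o21: int, o22: int) -> float:
    """Pearson's chi-square statistic for a 2x2 contingency table.

    Hypothetical cell layout for two words A and B:
      o11: contexts containing both A and B
      o12: contexts containing A but not B
      o21: contexts containing B but not A
      o22: contexts containing neither
    """
    n = o11 + o12 + o21 + o22
    # Product of the two row totals and the two column totals.
    denom = (o11 + o12) * (o11 + o21) * (o12 + o22) * (o21 + o22)
    if denom == 0:
        # A row or column is empty, so the statistic is undefined; report 0.
        return 0.0
    return n * (o11 * o22 - o12 * o21) ** 2 / denom


# Independent words: observed counts match the expected counts exactly.
print(chi_square_2x2(10, 20, 30, 60))  # 0.0
# Words that always appear together or not at all.
print(chi_square_2x2(10, 0, 0, 10))    # 20.0
```

A larger value indicates a stronger dependency between the two words. Note that the statistic alone does not distinguish attraction from avoidance, since the difference is squared.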

Highlights

The present paper addresses lexical disambiguation, which occurs in the semantic analysis phase of natural language analysis whenever ambiguity arises.
This paper proposes a novel unsupervised disambiguation method that outperforms existing knowledge-based and unsupervised lexical disambiguation methods without requiring a large sense-tagged corpus.
Because related words in the Korean Lexical Semantic Network share the same characteristics, the meaning of an ambiguous word can be distinguished by determining the relationship between the semantically related words of the ambiguous word and the words that co-occur with it in a local context.
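The idea in the highlights can be sketched as follows, under heavy assumptions. The sense inventory, the related-word sets, and the six-sentence untagged corpus are invented toy data (in English, reusing the "court" example). Counting only positive associations, estimating the prior from how often each sense's related words occur, and multiplying prior by association are choices made for this sketch and are not taken from the paper.

```python
from typing import Dict, List, Set


def positive_chi_square(o11: int, o12: int, o21: int, o22: int) -> float:
    """2x2 Pearson chi-square, kept only when the association is positive."""
    n = o11 + o12 + o21 + o22
    denom = (o11 + o12) * (o11 + o21) * (o12 + o22) * (o21 + o22)
    if denom == 0 or o11 * o22 <= o12 * o21:
        return 0.0
    return n * (o11 * o22 - o12 * o21) ** 2 / denom


def association(related: Set[str], word: str, corpus: List[Set[str]]) -> float:
    """Dependency between a sense's related words (pooled) and a context word."""
    o11 = o12 = o21 = o22 = 0
    for sentence in corpus:
        has_related = bool(related & sentence)
        has_word = word in sentence
        if has_related and has_word:
            o11 += 1
        elif has_related:
            o12 += 1
        elif has_word:
            o21 += 1
        else:
            o22 += 1
    return positive_chi_square(o11, o12, o21, o22)


def estimate_priors(senses: Dict[str, Set[str]],
                    corpus: List[Set[str]]) -> Dict[str, float]:
    """Add-one smoothed prior from sentences containing each sense's related words."""
    counts = {s: 1 + sum(1 for sent in corpus if rel & sent)
              for s, rel in senses.items()}
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}


def disambiguate(context: List[str], senses: Dict[str, Set[str]],
                 corpus: List[Set[str]]) -> str:
    """Pick the sense with the highest prior-weighted association score."""
    priors = estimate_priors(senses, corpus)
    scores = {s: priors[s] * sum(association(rel, w, corpus) for w in context)
              for s, rel in senses.items()}
    return max(scores, key=scores.get)


# Toy data (hypothetical): two senses of "court" and an untagged corpus.
SENSES = {"courthouse": {"judge", "trial", "lawyer"},
          "palace": {"king", "queen", "palace"}}
CORPUS = [{"judge", "verdict", "trial"}, {"lawyer", "verdict", "judge"},
          {"king", "throne", "palace"}, {"queen", "throne", "king"},
          {"judge", "trial", "lawyer"}, {"king", "palace", "garden"}]

print(disambiguate(["verdict"], SENSES, CORPUS))  # courthouse
print(disambiguate(["throne"], SENSES, CORPUS))   # palace
```

Because the counts are pooled over each sense's related words, the ambiguous word itself never needs to appear in a sense-tagged corpus, which is the point of the data deficiency argument above.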

Summary

Introduction

The present paper addresses lexical disambiguation, which occurs in the semantic analysis phase of natural language analysis whenever ambiguity arises. Lexical disambiguation refers to determining the correct meaning of a word that has multiple meanings (hereafter referred to as an ambiguous word) by evaluating the word in its context [1]. Lexical disambiguation, like morphological analysis and syntactic analysis, is essential in natural language processing and plays an important role in various application areas. Disambiguating a query word helps a search engine provide the high-quality information that a user needs. For example, if a user enters the query word court, the search engine should present the results grouped into courthouse-related and palace-related suggestions. Resolving semantic ambiguity is also important in text mining for documents in specialized fields, such as medical documents [2,3].

Journal: Electronics
Publication Date: Nov 26, 2021
Citations: 2
License type: CC BY 4.0

Similar Papers

A Preliminary Study of Clinical Abbreviation Disambiguation in Real Time
R A Miller ... H Xu
Applied Clinical Informatics | VOL. 6
01 Jan 2015

Word Sense Disambiguation Using Heterogeneous Language Resources
Kiyoaki Shirai ... Takayuki Tamagaki
01 Jan 2004

Knowledge-Based Method for Word Sense Disambiguation by Using Hindi WordNet
P Sharma ... N Joshi
Engineering, Technology & Applied Science Research | VOL. 9
10 Apr 2019

Multiple Heuristics and Their Combination for Automatic WordNet Mapping
Changki Lee ... Gary Geunbae Lee
Computers and the Humanities | VOL. 38
01 Nov 2004
