Abstract
An inherent property of natural languages is that the same word can carry distinct meanings in different sentences. Word sense induction (WSI) is the unsupervised process of discovering the meanings of a word. These meanings form a sense inventory, which is then used for word sense disambiguation (WSD). Fuzzy logic’s capacity to represent uncertainty makes it well suited to handling the vague information that WSI and WSD must process in natural language. In this article, a novel fuzzy-based methodology is proposed for extracting meaningful information from ambiguous words, in which both word senses and sense inventories are modeled as linguistic variables. To achieve WSI, the proposed method gathers a term set of level-2 fuzzy values for the variables representing words’ meanings. The values in the term set are then used for linguistic approximation by a fuzzy inference system designed for WSD based on the word’s context. The fuzzy word senses are extracted from an input corpus by word substitution, i.e., by using masked language models to predict words suitable as substitutes for the target word. The resulting fuzzy substitute sets are then clustered to discover similarities in the semantics they represent. Finally, each cluster is reformed into a sense value and added to the term set for the target word. The experimental results show that the proposed system outperforms the systems submitted to the standard SemEval 2010 and 2013 WSI and WSD tasks and achieves performance comparable to that of other fuzzy and nonfuzzy state-of-the-art methods.
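As a rough illustration of the pipeline summarized in the abstract (substitute sets, clustering, one sense value per cluster), the following minimal sketch may help. It rests on several assumptions that are not taken from the article: the substitute sets are hand-written toy data standing in for masked language model predictions, ordinary fuzzy sets are used rather than level-2 fuzzy values, and the similarity measure (fuzzy Jaccard), the greedy threshold clustering, and the names `fuzzy_jaccard`, `fuzzy_union`, `induce_senses`, and `disambiguate` are all hypothetical choices made for illustration only. The article's fuzzy inference system for WSD is replaced here by a simple nearest-sense lookup.

```python
from typing import Dict, List

# A fuzzy substitute set: substitute word -> membership degree in [0, 1].
FuzzySet = Dict[str, float]


def fuzzy_jaccard(a: FuzzySet, b: FuzzySet) -> float:
    """Fuzzy Jaccard similarity: sum of min memberships over sum of max memberships."""
    words = set(a) | set(b)
    inter = sum(min(a.get(w, 0.0), b.get(w, 0.0)) for w in words)
    union = sum(max(a.get(w, 0.0), b.get(w, 0.0)) for w in words)
    return inter / union if union > 0 else 0.0


def fuzzy_union(sets: List[FuzzySet]) -> FuzzySet:
    """Standard fuzzy union: keep the maximum membership of each word."""
    merged: FuzzySet = {}
    for s in sets:
        for word, mu in s.items():
            merged[word] = max(merged.get(word, 0.0), mu)
    return merged


def induce_senses(substitute_sets: List[FuzzySet], threshold: float = 0.3) -> List[FuzzySet]:
    """Greedy clustering (an assumed stand-in for the article's clustering step).

    Each substitute set joins the first cluster whose fuzzy union is similar
    enough; otherwise it starts a new cluster. Each cluster is then collapsed
    into a single fuzzy set that plays the role of a sense value.
    """
    clusters: List[List[FuzzySet]] = []
    for s in substitute_sets:
        for cluster in clusters:
            if fuzzy_jaccard(s, fuzzy_union(cluster)) >= threshold:
                cluster.append(s)
                break
        else:
            clusters.append([s])
    return [fuzzy_union(c) for c in clusters]


def disambiguate(context_set: FuzzySet, senses: List[FuzzySet]) -> int:
    """Return the index of the induced sense most similar to the context's substitutes."""
    return max(range(len(senses)), key=lambda i: fuzzy_jaccard(context_set, senses[i]))


# Toy substitute sets for the target word "bank" (hand-written, not model output).
toy_sets: List[FuzzySet] = [
    {"lender": 0.9, "institution": 0.7, "firm": 0.4},
    {"institution": 0.8, "lender": 0.6, "company": 0.5},
    {"shore": 0.9, "riverside": 0.8, "edge": 0.5},
    {"shore": 0.7, "edge": 0.6, "slope": 0.4},
]

senses = induce_senses(toy_sets)
chosen = disambiguate({"shore": 0.8, "slope": 0.3}, senses)
print(len(senses), chosen)
```

On this toy data the two financial substitute sets merge into one sense and the two river-related sets into another, and a new context whose substitutes include "shore" is assigned to the second sense. The threshold value of 0.3 is arbitrary and would need tuning on real data.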