The task of word sense disambiguation (WSD) plays a key role in multiple applications of natural language processing. In this paper, we propose a novel unsupervised method for the targeted Hindi WSD task. First, we create a weighted graph where the nodes correspond to the various synsets of the target word and of the neighboring context words. The edges in the graph represent the semantic relations between these synsets in the Hindi WordNet hierarchy. A path-based similarity measure, namely the Leacock-Chodorow similarity measure, is used to assign weights to the edges. An unsupervised weighted graph-based centrality algorithm is then used to identify the correct sense of a target word in a given context. The performance of the proposed algorithm is measured on 20 ambiguous Hindi nouns using four different graph-based centrality measures. We observed a maximum accuracy of 66.92% using the PageRank centrality measure, which is significantly better than earlier reported graph-based Hindi WSD algorithms evaluated on the same dataset.
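The pipeline described above (sense graph, Leacock-Chodorow edge weights, centrality-based sense selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy hypernym tree `HYPERNYM` with English stand-in synset names, the node-counting convention for path length, the choice to connect only senses of different words, and the damping factor of 0.85 are all assumptions made for illustration, and a real system would query Hindi WordNet instead.

```python
import math
from itertools import combinations

# Hypothetical toy hypernym tree (child -> parent); a stand-in for Hindi WordNet.
HYPERNYM = {
    "entity": None,
    "natural_object": "entity",
    "landform": "natural_object",
    "water_body": "natural_object",
    "institution": "entity",
    "financial_institution": "institution",
    "bank.river": "landform",
    "shore": "landform",
    "river": "water_body",
    "bank.finance": "financial_institution",
}

def path_to_root(synset):
    """Return the list of synsets from `synset` up to the root, inclusive."""
    path = []
    while synset is not None:
        path.append(synset)
        synset = HYPERNYM[synset]
    return path

# Taxonomy depth D, counted in nodes on the longest root-to-leaf path (assumed convention).
MAX_DEPTH = max(len(path_to_root(s)) for s in HYPERNYM)

def lch_similarity(a, b):
    """Leacock-Chodorow similarity: -log(path_nodes / (2 * D))."""
    pa, pb = path_to_root(a), path_to_root(b)
    index_in_b = {s: i for i, s in enumerate(pb)}
    for i, s in enumerate(pa):
        if s in index_in_b:  # first shared ancestor is the lowest one in a tree
            path_nodes = i + index_in_b[s] + 1
            return -math.log(path_nodes / (2 * MAX_DEPTH))
    raise ValueError("synsets share no common ancestor")

def build_graph(senses_by_word):
    """Connect senses of different words; edge weight is the LCH similarity."""
    weights = {}
    for (_, s1), (_, s2) in combinations(senses_by_word.items(), 2):
        for a in s1:
            for b in s2:
                sim = lch_similarity(a, b)
                weights.setdefault(a, {})[b] = sim
                weights.setdefault(b, {})[a] = sim
    return weights

def weighted_pagerank(weights, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration on an undirected graph."""
    nodes = list(weights)
    rank = {n: 1 / len(nodes) for n in nodes}
    strength = {n: sum(weights[n].values()) for n in nodes}
    for _ in range(iterations):
        rank = {
            n: (1 - damping) / len(nodes)
            + damping * sum(rank[m] * w / strength[m] for m, w in weights[n].items())
            for n in nodes
        }
    return rank

def disambiguate(target, senses_by_word):
    """Pick the sense of `target` with the highest centrality score."""
    scores = weighted_pagerank(build_graph(senses_by_word))
    return max(senses_by_word[target], key=scores.get)

senses = {
    "bank": ["bank.river", "bank.finance"],  # target word with two candidate senses
    "river": ["river"],                      # context words
    "shore": ["shore"],
}
print(disambiguate("bank", senses))
```

In this toy context the river sense of "bank" sits closer to the context synsets in the hierarchy, so it accumulates more edge weight and a higher PageRank score than the financial sense. Swapping `weighted_pagerank` for another centrality function is how the other centrality measures mentioned above could be compared in the same framework.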