Abstract

A word can have different meanings depending on the context in which it appears, and identifying the proper sense is crucial in many tasks such as machine translation, anaphora resolution, and search-engine recommendation. A word interpreted with the wrong sense can render a whole sentence meaningless. The task of Word Sense Disambiguation (WSD) is to assign a sense to an ambiguous word based on its context. It is a fundamental problem in NLP with a variety of solutions. Usual approaches involve supervised machine learning techniques that use a bag-of-words representation, but for better results WSD should move from the traditional bag-of-words approach to sequence modeling. Deep neural networks have been used for a variety of NLP tasks, and a Bidirectional Long Short-Term Memory (BiLSTM) network or a Bidirectional Gated Recurrent Unit (BiGRU) network are candidate solutions. The proposed model follows a one-model-per-word approach, in which each ambiguous word has its own BiGRU or BiLSTM trained for disambiguation; this allows easy modification of an existing model whenever needed. A comparative study of various deep learning models is also performed, and the evaluation shows the bidirectional models outperforming the other deep learning models.
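To make the bidirectional sequence-modeling idea concrete, here is a minimal NumPy sketch of a BiGRU forward pass for one ambiguous word: the sentence is read left-to-right and right-to-left, the two final hidden states are concatenated, and a linear layer projects them to a probability over that word's senses. This is an illustrative assumption of how such a per-word model could be wired, not the paper's implementation; all function and parameter names (`gru_step`, `bigru_sense_scores`, `W_out`, etc.) are hypothetical, and biases are omitted for brevity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    # One GRU cell step (biases omitted for brevity).
    z = sigmoid(Wz @ x + Uz @ h)            # update gate
    r = sigmoid(Wr @ x + Ur @ h)            # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))  # candidate state
    return (1.0 - z) * h + z * h_tilde

def gru_forward(xs, params, hidden_dim):
    # Run the cell over a list of word vectors, return the final state.
    h = np.zeros(hidden_dim)
    for x in xs:
        h = gru_step(x, h, *params)
    return h

def bigru_sense_scores(xs, fwd_params, bwd_params, W_out, hidden_dim):
    # Forward pass over the sentence, backward pass over its reverse,
    # concatenate, then softmax over this word's candidate senses.
    h_f = gru_forward(xs, fwd_params, hidden_dim)
    h_b = gru_forward(xs[::-1], bwd_params, hidden_dim)
    logits = W_out @ np.concatenate([h_f, h_b])
    e = np.exp(logits - logits.max())
    return e / e.sum()
```

In the one-model-per-word setup described above, each ambiguous word would get its own parameter set (`fwd_params`, `bwd_params`, `W_out`) sized to its number of senses, so retraining one word's model leaves the others untouched.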
