Abstract
In recent years, there has been increasing interest in learning distributed representations of word senses. Traditional context-clustering-based models usually require careful tuning of model parameters and typically perform worse on infrequent word senses. This paper presents a novel approach that addresses these limitations by first initializing the word sense embeddings through learning sentence-level embeddings from WordNet glosses using a convolutional neural network. The initialized word sense embeddings are then used by a context clustering model to generate the distributed representations of word senses. Our learned representations outperform the publicly available embeddings on half of the metrics in the word similarity task and on 6 of the 13 subtasks in the analogical reasoning task, and give the best overall accuracy in the word sense effect classification task, which shows the effectiveness of our proposed distributed representation learning model.
Highlights
The representation of knowledge has become a focal area in natural language processing
The main contributions of this work are as follows: (1) we propose to use a sentence composition model to capture word sense from a knowledge base, e.g., WordNet; (2) while previous approaches to sense vector clustering often adopted random initialization, we propose to initialize the sense vectors and the number of sense clusters with the word sense knowledge learned from WordNet for better clustering results; (3) we further verify our learned distributed word sense representations on three different tasks: word similarity measurement, analogical reasoning, and word sense effect classification
To deal with this problem, we propose to incorporate the sense vectors learned from WordNet glosses through convolutional neural network (CNN) composition as prior knowledge into a context clustering model such as the MSSG model proposed by Neelakantan et al. [4]
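The gloss composition step can be pictured as follows. This is a minimal sketch, not the authors' code: tokens of a WordNet gloss are embedded, a 1-D convolution slides over token windows, and max-pooling yields one fixed-size sense vector. All dimensions, names, and the random parameters are illustrative assumptions.

```python
# Sketch of CNN-based sentence composition over a WordNet gloss:
# embed tokens, convolve over a sliding window, max-pool into one
# sense vector. Sizes and weights are illustrative, not the paper's.
import numpy as np

rng = np.random.default_rng(0)
DIM, WINDOW, VOCAB = 50, 3, 1000

# Hypothetical word-embedding table and convolution filter bank.
word_emb = rng.standard_normal((VOCAB, DIM))
conv_w = rng.standard_normal((DIM, WINDOW * DIM)) * 0.01

def gloss_to_sense_vector(token_ids):
    """Compose a gloss (a list of token ids) into one sense vector."""
    x = word_emb[token_ids]                        # (n_tokens, DIM)
    windows = [x[i:i + WINDOW].ravel()             # width-3 sliding windows
               for i in range(len(x) - WINDOW + 1)]
    feats = np.tanh(np.stack(windows) @ conv_w.T)  # (n_windows, DIM)
    return feats.max(axis=0)                       # max-pool over windows

sense_vec = gloss_to_sense_vector([5, 17, 42, 7, 99])
print(sense_vec.shape)  # (50,)
```

The max-pooling step is what makes the output length-independent, so glosses of different lengths all map to vectors in the same space as the word embeddings.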
Summary
The representation of knowledge has become a focal area in natural language processing. A word sense is represented as a dense, real-valued vector. To this end, most existing approaches adopted a cluster-based paradigm, which produces different sense vectors for each polysemous or homonymous word by clustering the contexts of the target words. We modify the MSSG algorithm for context clustering by initializing the sense vectors with the representations learned by our CNN-based sentence composition model. The main contributions of this work are as follows: (1) we propose to use a sentence composition model to capture word sense from a knowledge base, e.g., WordNet; (2) while previous approaches to sense vector clustering often adopted random initialization, we propose to initialize the sense vectors and the number of sense clusters with the word sense knowledge learned from WordNet for better clustering results; (3) we further verify our learned distributed word sense representations on three different tasks: word similarity measurement, analogical reasoning, and word sense effect classification.
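The clustering modification can be illustrated with a toy sketch (assumptions throughout, not the MSSG implementation): the sense clusters of a target word start from gloss-derived vectors rather than random centers, each occurrence's context vector is assigned to the nearest cluster by cosine similarity, and that cluster's center is nudged toward the context.

```python
# Toy sketch of gloss-initialized context clustering: pick the sense
# cluster closest (by cosine similarity) to a context vector and
# update its center. Vectors, dimensions, and the learning rate are
# illustrative assumptions.
import numpy as np

def assign_and_update(sense_vectors, context_vec, lr=0.1):
    """Return the index of the nearest sense cluster and move its
    center slightly toward the observed context vector."""
    norms = np.linalg.norm(sense_vectors, axis=1) * np.linalg.norm(context_vec)
    sims = sense_vectors @ context_vec / np.maximum(norms, 1e-8)
    k = int(np.argmax(sims))
    sense_vectors[k] += lr * (context_vec - sense_vectors[k])
    return k

# Two gloss-initialized senses for a word like "bank" (toy 4-d vectors).
senses = np.array([[1.0, 0.0, 0.0, 0.0],    # e.g., river-bank sense
                   [0.0, 1.0, 0.0, 0.0]])   # e.g., financial sense
context = np.array([0.9, 0.1, 0.0, 0.0])    # context vector of one occurrence
print(assign_and_update(senses, context))   # → 0 (nearest sense)
```

Starting from gloss-derived centers also fixes the number of clusters to the number of WordNet senses, which is the point of contribution (2): the model no longer has to guess the cluster count or recover from a poor random start.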