Abstract
We propose two improvements on lexical association used in embedding learning: factorizing individual dependency relations and using lexicographic knowledge from monolingual dictionaries. Both proposals provide low-entropy lexical co-occurrence information, and are empirically shown to improve embedding learning by performing notably better than several popular embedding models in similarity tasks.

1 Lexical Embeddings and Relatedness

Lexical embeddings are essentially real-valued distributed representations of words. As a vector-space model, an embedding model approximates semantic relatedness with the Euclidean distance between embeddings, which in turn helps better estimate the true lexical distribution in various NLP tasks. In recent years, researchers have developed efficient and effective algorithms for learning embeddings (Mikolov et al., 2013a; Pennington et al., 2014) and extended model applications from language modelling to various areas of NLP, including lexical semantics (Mikolov et al., 2013b) and parsing (Bansal et al., 2014).

To approximate semantic relatedness with geometric distance, objective functions are usually chosen to correlate positively with the Euclidean similarity between the embeddings of related words. Maximizing such an objective function is then equivalent to adjusting the embeddings so that those of related words become geometrically closer.

The definition of relatedness among words can have a profound influence on the quality of the resulting embedding models. In most existing studies, relatedness is defined by co-occurrence within a window frame sliding over texts. Although supported by the distributional hypothesis (Harris, 1954), this definition suffers from two major limitations. Firstly, the window frame is usually rather small (for efficiency and sparsity considerations), which increases the false-negative rate by missing long-distance dependencies. Secondly, a window frame can (and often does) span across different constituents in a sentence, increasing the false-positive rate by associating unrelated words. The problem worsens as the window size increases, since each false-positive n-gram appears in two subsuming false-positive (n+1)-grams.

Several existing studies have addressed these limitations of window-based contexts. Nonetheless, we hypothesize that lexical embedding learning can further benefit from (1) factorizing syntactic relations into individual relations for structured syntactic information and (2) defining relatedness using lexicographic knowledge. We will show that implementing these ideas brings notable improvements in lexical similarity tasks.
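To make the contrast between window-based and syntactically informed contexts concrete, the following minimal Python sketch compares the (target, context) pairs produced by sliding windows of different sizes with pairs given by individual dependency relations. It is an illustration of the general idea only, not the implementation used in this work; the example sentence, the window sizes, and the hand-annotated dependency arcs are assumptions chosen purely for exposition (in practice the arcs would come from a dependency parser).

    # Illustrative sketch only: window-based vs. dependency-based contexts.
    sentence = "the cat on the mat chased a mouse".split()

    def window_pairs(tokens, size):
        # Collect directed (target, context) pairs within a symmetric window.
        pairs = set()
        for i in range(len(tokens)):
            for j in range(max(0, i - size), min(len(tokens), i + size + 1)):
                if i != j:
                    pairs.add((tokens[i], tokens[j]))
        return pairs

    # Hand-annotated (head, dependent, relation) arcs, assumed for illustration;
    # a real system would obtain these from a dependency parser.
    dependency_arcs = {
        ("chased", "cat", "nsubj"),
        ("chased", "mouse", "dobj"),
        ("cat", "mat", "nmod"),
    }

    small = window_pairs(sentence, 2)
    large = window_pairs(sentence, 5)

    print(("chased", "cat") in small)   # False: long-distance subject missed by the small window
    print(("chased", "cat") in large)   # True: the larger window recovers it ...
    print(("mat", "mouse") in large)    # True: ... but also pairs words with no syntactic relation
    print([(h, d) for h, d, rel in dependency_arcs if rel == "nsubj"])  # one factorized relation

Under these illustrative assumptions, enlarging the window from 2 to 5 recovers the long-distance subject-verb pair but also introduces pairs such as (mat, mouse) that cross constituent boundaries, whereas the factorized dependency arcs supply exactly one labelled relation per related pair.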