Abstract
Additive compositionality of word embedding models has been studied from both empirical and theoretical perspectives. Existing work on justifying the additive compositionality of word embedding models relies on the rather strong assumption of a uniform word distribution. In this paper, we relax that assumption and propose more realistic conditions under which additive compositionality can be proven, and we develop a novel word and sub-word embedding model that satisfies additive compositionality under those conditions. We then empirically show our model's improved semantic representation performance on word similarity and noisy sentence similarity tasks.
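To make the notion concrete, additive compositionality means that the vector of a composed expression is (approximately) the sum of its parts, which is what underlies analogies such as vec(king) - vec(man) + vec(woman) ≈ vec(queen). The following minimal Python sketch, not taken from the paper, illustrates this with toy vectors constructed so that the relation holds by design; real experiments would use trained OLIVE or SkipGram embeddings.

import numpy as np

def cosine(a, b):
    # Cosine similarity between two vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
dim = 50

# Hypothetical embeddings: queen is built as king - man + woman plus
# small noise, so the additive relation holds by construction.
king, man, woman = (rng.standard_normal(dim) for _ in range(3))
queen = king - man + woman + 0.01 * rng.standard_normal(dim)

# Additive compositionality predicts the composed vector is close to queen.
composed = king - man + woman
print(f"cos(composed, queen) = {cosine(composed, queen):.3f}")  # ~1.0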
Highlights
Previous word embedding studies have empirically shown linguistic regularities represented as linear translations in the word vector space, but they do not explain these empirical results mathematically (Mikolov et al., 2013b; Pennington et al., 2014; Bojanowski et al., 2017).
We evaluate semantic representation performance with the word similarity task, and we show the robustness of our model by conducting the word similarity task across various vocabulary sizes and the sentence similarity task under various noise settings.
We propose 1) novel theoretical conditions for an additively compositional word embedding model and 2) a novel word/sub-word embedding model, which we call OLIVE, that satisfies additive compositionality.
Summary
Previous word embedding studies have empirically shown linguistic regularities represented as linear translations in the word vector space, but they do not explain these empirical results mathematically (Mikolov et al., 2013b; Pennington et al., 2014; Bojanowski et al., 2017). Arora et al. (2016) propose a generative model to explain PMI-based distributional models and present a mathematical explanation of the linguistic regularity in SkipGram (Mikolov et al., 2013a). We propose a novel word/sub-word embedding model, which we call OLIVE, that satisfies exact additive compositionality. Being more theoretically sound, OLIVE shows improved empirical performance in the semantic representation of word vectors, and by eliminating the sampling process of SGNS, it is robust to the size of the vocabulary. We evaluate semantic representation performance with the word similarity task, and we demonstrate robustness by running the word similarity task across various vocabulary sizes and the sentence similarity task under various noise settings. In addition to this theoretical justification, we show the empirical performance of our model and its robustness.
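The word similarity evaluation mentioned above is conventionally scored by rank-correlating a model's cosine similarities with human similarity judgments on benchmark word pairs (e.g., WordSim-353). The sketch below shows this standard Spearman-correlation protocol with hypothetical embeddings and ratings; that the paper follows exactly this setup is an assumption, since the summary does not spell out the scoring details.

import numpy as np
from scipy.stats import spearmanr

# Hypothetical 3-dimensional embeddings, standing in for trained vectors.
embeddings = {
    "car": np.array([0.90, 0.10, 0.20]),
    "automobile": np.array([0.85, 0.15, 0.25]),
    "coast": np.array([0.10, 0.80, 0.30]),
    "shore": np.array([0.12, 0.75, 0.35]),
    "noon": np.array([0.20, 0.20, 0.90]),
}

# (word1, word2, hypothetical human similarity rating on a 0-10 scale)
human_pairs = [
    ("car", "automobile", 9.5),
    ("coast", "shore", 9.1),
    ("car", "coast", 1.5),
    ("noon", "shore", 0.5),
]

def cosine(a, b):
    # Cosine similarity between two vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

model_scores = [cosine(embeddings[w1], embeddings[w2]) for w1, w2, _ in human_pairs]
human_scores = [rating for _, _, rating in human_pairs]

# Higher rank correlation with human ratings indicates better
# semantic representation.
rho, _ = spearmanr(model_scores, human_scores)
print(f"Spearman rho = {rho:.3f}")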