Abstract

Recent progress in deep learning for training word embeddings has motivated us to explore semantic representations of longer texts, such as sentences, paragraphs, and chapters. Existing methods typically combine word weights and word vectors to compute a sentence embedding; however, they discard word order and the syntactic structure of the sentence. This paper proposes SynTree-WordVec, a method for deriving sentence embeddings that merges word vectors with the syntactic parse tree produced by the Stanford parser. Experimental results show its potential to overcome the shortcomings of existing methods: compared with traditional weighting-based sentence embedding, our method achieves better or comparable performance on various text similarity tasks, especially at low embedding dimensions.
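The abstract does not specify how the parse tree and word vectors are merged. As a purely illustrative sketch (not the paper's actual SynTree-WordVec algorithm), one simple way syntax can influence a sentence embedding is to compose vectors recursively over the constituency tree rather than flatly averaging all words; the toy vectors and the tiny tree below are assumptions for demonstration only.

```python
# Hypothetical sketch: compose a sentence embedding by recursing over a
# constituency parse tree instead of averaging all word vectors flatly.
# The 3-d vectors and the hand-built tree are toy stand-ins; the paper
# uses the Stanford parser and trained word embeddings.

import numpy as np

# Toy 3-d word vectors (real systems typically use 100-300 dimensions).
WORD_VECS = {
    "the": np.array([0.1, 0.0, 0.2]),
    "cat": np.array([0.9, 0.3, 0.1]),
    "sat": np.array([0.2, 0.8, 0.5]),
}

def tree_embed(node):
    """Recursively merge child embeddings; leaves look up word vectors.

    A leaf is a word string; an internal node is a tuple of children.
    'Merge' here is a plain mean, so words nested deeper in the tree are
    down-weighted relative to a flat average -- one simple way the
    syntactic structure can shape the final sentence vector.
    """
    if isinstance(node, str):
        return WORD_VECS[node]
    child_vecs = [tree_embed(child) for child in node]
    return np.mean(child_vecs, axis=0)

# Parse tree for "the cat sat": NP = ("the", "cat"), S = (NP, "sat")
tree = (("the", "cat"), "sat")
sentence_vec = tree_embed(tree)
```

Note that with this composition, "the cat" is first merged into a phrase vector before being combined with "sat", so the result differs from the order-insensitive mean of all three word vectors.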
