Short Text Representation Model Construction Method Based on Novel Semantic Aggregation Technology

Dong Yi, Zhai Jia, Li Xin, Chen Feng

doi:10.1007/978-981-15-1922-2_7

Abstract

Existing semantic representation models for short texts have limited representational ability. Methods that combine word embeddings with semantic weights have low computational complexity, and they even outperform methods built on more complex structures such as RNN and LSTM. This paper proposes a semantic representation model for short texts based on ELMo (Embeddings from Language Models). The model makes three contributions: first, it adopts the more advanced word embedding model ELMo; second, it introduces a semantic keyword extraction method for short texts based on a topic model (Latent Dirichlet Allocation, LDA); third, it uses stochastic gradient descent (SGD) to learn the semantic weights of the semantic keywords in short texts. Experimental results show that, compared with existing short text semantic representation models, the proposed model achieves high semantic representation ability on short texts both in a specific domain and in the open domain.
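The idea of combining word embeddings with learned semantic weights can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: hand-written 2-D toy vectors stand in for ELMo embeddings, the token list in `weights` stands in for keywords that an LDA step would select, and the squared-error objective on the dot product of two text vectors is a hypothetical training signal, since the abstract does not state the loss.

```python
# Toy sketch: a short text is the weighted sum of its word vectors,
# and the per-word semantic weights are learned with plain SGD.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def represent(tokens, emb, weights):
    """Weighted sum of word vectors; words without a learned weight use 1.0."""
    dim = len(next(iter(emb.values())))
    vec = [0.0] * dim
    for tok in tokens:
        if tok in emb:
            w = weights.get(tok, 1.0)
            for i, x in enumerate(emb[tok]):
                vec[i] += w * x
    return vec

def pair_loss(pair, emb, weights):
    a, b, target = pair
    err = dot(represent(a, emb, weights), represent(b, emb, weights)) - target
    return err * err

def sgd_step(pair, emb, weights, lr=0.05):
    """One SGD update of the weights on a single (text_a, text_b, target) pair."""
    a, b, target = pair
    va, vb = represent(a, emb, weights), represent(b, emb, weights)
    err = dot(va, vb) - target
    grads = {}
    # d(va . vb)/dw_tok = e_tok . vb per occurrence in a, e_tok . va per occurrence in b
    for tok in a:
        if tok in weights and tok in emb:
            grads[tok] = grads.get(tok, 0.0) + 2 * err * dot(emb[tok], vb)
    for tok in b:
        if tok in weights and tok in emb:
            grads[tok] = grads.get(tok, 0.0) + 2 * err * dot(emb[tok], va)
    for tok, g in grads.items():
        weights[tok] -= lr * g

# Assumed toy embeddings (placeholders for ELMo vectors).
emb = {"flight": [1.0, 0.0], "recipe": [0.0, 1.0], "the": [0.6, 0.6]}
# Assumed keyword set (placeholder for LDA output), all weights start at 1.0.
weights = {"flight": 1.0, "recipe": 1.0, "the": 1.0}
# Hypothetical training pairs: (text_a, text_b, target similarity).
pairs = [
    (["the", "flight"], ["flight"], 1.0),
    (["the", "flight"], ["the", "recipe"], 0.0),
]

initial_loss = sum(pair_loss(p, emb, weights) for p in pairs)
for _ in range(200):
    for p in pairs:
        sgd_step(p, emb, weights)
final_loss = sum(pair_loss(p, emb, weights) for p in pairs)
```

In this toy setup the shared word "the" makes two unrelated texts look similar, so training pushes its weight down while the weight of "flight" stays comparatively large. That is the intuition behind learning semantic weights rather than averaging word vectors uniformly.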

Full Text