Abstract

In recent years, deep compositional models have emerged as a popular technique for sentence representation learning in computational linguistics and natural language processing. These models normally train various forms of neural networks on top of pretrained word embeddings using a task-specific corpus. However, most of these works neglect the multisense nature of words in the pretrained word embeddings. In this paper we introduce topic models to enrich word embeddings with the multiple senses of words. Integrating the topic model with various semantic composition processes leads to topic-aware convolutional neural networks (CNNs) and topic-aware long short-term memory (LSTM) networks. Unlike previous multisense word embedding models that assign multiple independent, sense-specific embeddings to each word, our proposed models are lightweight, flexible frameworks that regard a word's sense as the composition of two parts: a general sense derived from a large corpus and a topic-specific sense derived from a task-specific corpus. In addition, our proposed models focus on semantic composition rather than word understanding. With the help of topic models, we can integrate the topic-specific sense at the word level before composition and at the sentence level after composition. We conduct comprehensive experiments on five public sentence classification datasets; the results show that our proposed topic-aware deep compositional models achieve competitive or better performance than other text representation learning methods.
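To make the two-level integration concrete, below is a minimal PyTorch-style sketch, not the authors' exact architecture: all module names, dimensions, and the choice of an LSTM composer are illustrative assumptions. It composes a general (pretrained) word embedding with a topic-specific embedding at the word level, then appends the sentence's topic mixture after composition:

```python
# Minimal sketch of a topic-aware LSTM classifier (illustrative assumptions
# throughout; not the paper's exact model). Per-token topic assignments and
# the sentence-level topic distribution are assumed to come from a topic
# model such as LDA fitted to the task-specific corpus.
import torch
import torch.nn as nn

class TopicAwareLSTM(nn.Module):
    def __init__(self, vocab_size, num_topics, word_dim=300, topic_dim=50,
                 hidden_dim=128, num_classes=2):
        super().__init__()
        # General sense: word embeddings pretrained on a large corpus.
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        # Topic-specific sense: one embedding per topic, learned on the task corpus.
        self.topic_emb = nn.Embedding(num_topics, topic_dim)
        # Word-level integration: compose general + topic sense before the LSTM.
        self.lstm = nn.LSTM(word_dim + topic_dim, hidden_dim, batch_first=True)
        # Sentence-level integration: append the sentence's topic mixture
        # to the composed sentence vector before classification.
        self.classifier = nn.Linear(hidden_dim + num_topics, num_classes)

    def forward(self, words, word_topics, topic_dist):
        # words: (B, T) token ids; word_topics: (B, T) per-token topic ids;
        # topic_dist: (B, num_topics) topic proportions per sentence.
        x = torch.cat([self.word_emb(words),
                       self.topic_emb(word_topics)], dim=-1)
        _, (h_n, _) = self.lstm(x)                  # h_n: (1, B, hidden_dim)
        sent = torch.cat([h_n[-1], topic_dist], dim=-1)
        return self.classifier(sent)

# Usage with random placeholder inputs:
model = TopicAwareLSTM(vocab_size=20000, num_topics=50)
logits = model(torch.randint(0, 20000, (4, 12)),
               torch.randint(0, 50, (4, 12)),
               torch.rand(4, 50).softmax(dim=-1))
```

Concatenation is only one plausible composition operator here; the same word-level/sentence-level integration pattern applies with a CNN composer in place of the LSTM.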
