Abstract

The classical bag-of-words and probabilistic topic models are widely used for topic classification. Recently, neural networks have achieved remarkable performance and become the mainstream approach, owing to their ability to encode distributed semantic features of documents from word embeddings. To demonstrate the strengths of neural networks, this paper compares Latent Dirichlet Allocation (LDA) with the mainstream neural architectures: the Convolutional Neural Network (CNN), the Recurrent Neural Network (RNN), and the Recurrent Convolutional Neural Network (RCNN). Beyond this, we combine the latent topic information inferred by LDA with the distributed semantic information learned by neural networks to produce a richer document representation for topic classification. Experimental results show that the proposed representation outperforms each individual system and achieves excellent performance on topic classification tasks.
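A minimal sketch of the combination idea described above, assuming the combined representation is formed by concatenating a document's LDA topic distribution with its neural embedding (the vectors and dimensions below are illustrative placeholders, not the paper's actual data or code):

```python
def combine_representations(lda_topics, neural_embedding):
    """Concatenate latent-topic features (from LDA) with distributed
    semantic features (from a CNN/RNN/RCNN encoder) into one vector."""
    return list(lda_topics) + list(neural_embedding)

# Toy example: a 4-topic LDA posterior and a 6-dimensional neural embedding.
lda_topics = [0.70, 0.10, 0.15, 0.05]
neural_embedding = [0.2, -0.4, 0.1, 0.9, -0.3, 0.5]
doc_vector = combine_representations(lda_topics, neural_embedding)
print(len(doc_vector))  # the combined 10-dimensional document representation
```

The concatenated vector can then be fed to any downstream classifier; the paper's point is that topic-level and semantic-level features are complementary.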
