Abstract

Advances in deep generative models have attracted significant research interest in neural topic modeling. The recently proposed Adversarial-neural Topic Model models topics with an adversarially trained generator network and employs a Dirichlet prior to capture the semantic patterns in latent topics. It is effective in discovering coherent topics but unable to infer topic distributions for given documents or to utilize available document labels. To overcome these limitations, we propose Topic Modeling with Cycle-consistent Adversarial Training (ToMCAT) and its supervised version sToMCAT. ToMCAT employs a generator network to interpret topics and an encoder network to infer document topics. Adversarial training and cycle-consistent constraints are used to encourage the generator and the encoder to produce realistic samples that coordinate with each other. sToMCAT extends ToMCAT by incorporating document labels into the topic modeling process to help discover more coherent topics. The effectiveness of the proposed models is evaluated on unsupervised/supervised topic modeling and text classification. The experimental results show that our models can produce both coherent and informative topics, outperforming a number of competitive baselines.
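
To make the cycle-consistency idea concrete, here is a minimal PyTorch sketch of such a loss term, assuming an encoder that maps bag-of-words document vectors to topic distributions and a generator that maps topic vectors back to word distributions. The function name, the L1 penalty, and the Dirichlet sampling call are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of a cycle-consistency term (assumed names and losses,
# not the authors' code).
import torch
import torch.nn.functional as F

def cycle_consistency_loss(encoder, generator, doc_repr, topic_sample):
    """encoder: maps (batch, vocab) bag-of-words vectors to topic distributions.
    generator: maps (batch, num_topics) topic vectors back to word distributions.
    topic_sample: topic vectors drawn from a Dirichlet prior."""
    # Document cycle: document -> inferred topics -> reconstructed document.
    doc_cycle = generator(encoder(doc_repr))
    # Topic cycle: sampled topics -> generated document -> recovered topics.
    topic_cycle = encoder(generator(topic_sample))
    # L1 penalties push the two networks to act as near-inverses of each other.
    return F.l1_loss(doc_cycle, doc_repr) + F.l1_loss(topic_cycle, topic_sample)

# Example: draw 32 topic vectors over 50 topics from a symmetric Dirichlet prior.
# topic_sample = torch.distributions.Dirichlet(torch.ones(50)).sample((32,))
```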

Highlights

  • Topic models, such as Latent Dirichlet Allocation (LDA) (Blei et al., 2003), aim to discover underlying topics and semantic structures from text collections.

  • Due to its interpretability and effectiveness, LDA has been extended to many Natural Language Processing (NLP) tasks (Lin and He, 2009; McAuley and Leskovec, 2013; Zhou et al., 2017).

  • A document labeled as ‘sports’ is more likely associated with topics such as ‘basketball’ or ‘football’ than with ‘economics’ or ‘politics’. To address such limitations of the Adversarial-neural Topic Model (ATM), we propose a novel neural topic modeling approach, named Topic Modeling with Cycle-consistent Adversarial Training (ToMCAT).


Summary

Introduction

Topic models, such as Latent Dirichlet Allocation (LDA) (Blei et al., 2003), aim to discover underlying topics and semantic structures from text collections. Inspired by the variational autoencoder (VAE) (Kingma and Welling, 2013), Miao et al. (2016) proposed the Neural Variational Document Model, which interprets the latent code in a VAE as topics. Following this line, Srivastava and Sutton (2017) adopted a logistic normal prior rather than a Gaussian one to mimic the simplex property of topic distributions. Although ATM was shown to be effective in discovering coherent topics, it cannot be used to induce the topic distribution of a given document due to the absence of a topic inference module. This limitation hinders its application to downstream tasks, such as text classification. Moreover, ATM cannot exploit available document labels, which often carry useful topical signals: a document labeled as ‘sports’ is more likely associated with topics such as ‘basketball’ or ‘football’ than with ‘economics’ or ‘politics’. To address these limitations of ATM, we propose a novel neural topic modeling approach, named Topic Modeling with Cycle-consistent Adversarial Training (ToMCAT), together with its supervised version sToMCAT. Experimental results on unsupervised/supervised topic modeling and text classification demonstrate the effectiveness of the proposed approaches.
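
For context, the logistic normal trick mentioned above can be sketched in a few lines: a reparameterized Gaussian sample is pushed through a softmax so that the result lies on the probability simplex. This is a generic sketch of the reparameterization, not Srivastava and Sutton's exact model.

```python
# Generic logistic-normal reparameterization sketch (not the paper's full model).
import torch

def sample_topic_distribution(mu, log_sigma):
    """Reparameterized draw: a Gaussian sample pushed through softmax yields
    a valid topic distribution (non-negative entries summing to 1)."""
    eps = torch.randn_like(mu)
    z = mu + torch.exp(log_sigma) * eps  # Gaussian reparameterization trick
    return torch.softmax(z, dim=-1)      # map onto the topic simplex
```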

Neural Topic Modeling
Unsupervised Style Transfer
Methodology
ToMCAT
Encoder Network E
Generator Network G
Training Objective
Training Details
Experimental Setup
Topic Modeling
Unsupervised Topic Modeling
Supervised Topic Modeling
Impact of Topic Numbers
Text Classification
Conclusion
Findings
A Discovered Topics on NYTimes