Abstract

Hierarchical topic models, such as hierarchical Latent Dirichlet Allocation (hLDA) and its variants, can organize topics into a hierarchy automatically. At the same time, many document collections come with hierarchical label information. Incorporating this information into the topic modeling process can help users obtain a more reasonable hierarchical structure. However, after analyzing various real-world datasets, we find that these hierarchical labels are ambiguous and conflicting at some levels, which introduces errors and restrictions into the exploration of latent topics and the hierarchical structure. We call this the horizontal topic expansion problem. To address it, in this paper we propose a novel hierarchical topic model named the horizontal and vertical hierarchical topic model (HV-HTM), which incorporates the observed hierarchical label information into the topic generation process while retaining the flexibility to expand the hierarchical structure both horizontally and vertically during modeling. We conduct experiments on the BBC news and Yahoo! Answers datasets and evaluate the effectiveness of HV-HTM on three evaluation metrics. The experimental results show that HV-HTM significantly improves topic modeling compared to state-of-the-art models, and that it also obtains a more interpretable hierarchical structure.

Highlights

  • Topic modeling is one of the most popular research areas in Natural Language Processing (NLP), which aims at discovering the latent topics in a large collection of documents

  • We focus on hierarchical topic models that incorporate observed hierarchical label information, and on how to expand the topic tree horizontally and vertically

  • The runtime and memory usage of hierarchical Latent Dirichlet Allocation (hLDA) increase dramatically, far more than those of Semi-Supervised Hierarchical Latent Dirichlet Allocation (SSHLDA) and the horizontal and vertical hierarchical topic model (HV-HTM). These results indicate that HV-HTM matches the running performance of SSHLDA and is much better than hLDA


Summary

Introduction

Topic modeling is one of the most popular research areas in Natural Language Processing (NLP), which aims at discovering the latent topics in a large collection of documents. Topic models, such as Latent Dirichlet Allocation (LDA) [1], have proven useful in extracting latent topics, but they organize topics in a flat structure. Hierarchical topic models, like hierarchical Latent Dirichlet Allocation (hLDA) [2], are proposed to relax this restriction. Those models make use of the Chinese restaurant process and
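The Chinese restaurant process mentioned above can be sketched as follows: a new customer joins an existing table with probability proportional to the number of customers already seated there, or opens a new table with probability proportional to a concentration parameter. The snippet below is an illustrative sketch of this sampling step, not the paper's model; the function name `crp_assign` and the parameter `gamma` are our own notation.

```python
import random

def crp_assign(counts, gamma=1.0):
    """Seat one new customer under the Chinese restaurant process.

    counts: number of customers at each existing table.
    gamma:  concentration parameter controlling new-table probability.
    Returns the index of the chosen table; len(counts) means a new table.
    """
    total = sum(counts) + gamma
    r = random.uniform(0.0, total)
    for k, n in enumerate(counts):
        if r < n:
            return k      # join existing table k (prob. counts[k] / total)
        r -= n
    return len(counts)    # open a new table (prob. gamma / total)

# Simulate seating 100 customers; the table counts grow rich-get-richer.
tables = []
for _ in range(100):
    k = crp_assign(tables, gamma=1.0)
    if k == len(tables):
        tables.append(1)  # new table with its first customer
    else:
        tables[k] += 1
```

In hLDA, a nested version of this process is used: each document samples a root-to-leaf path, applying a draw like the one above at every level of the tree, which is what lets the hierarchy grow without fixing its branching in advance.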

