Cluster-based sparse topical coding for topic mining and document clustering

Parvin Ahmadi, Mahmoud Tabandeh, Iman Gholampour

doi:10.1007/s11634-017-0280-3

Abstract

In this paper, we introduce Cluster-based Sparse Topical Coding, a document clustering method built on Sparse Topical Coding. Topic modeling can improve the clustering of text documents: documents are described with bag-of-words models and projected into a topic space, and the latent semantic descriptions derived by the topic model are then used as features in the clustering process. In the proposed method, document clustering and topic modeling are integrated in a unified framework with the aim of maximizing performance. The framework combines Sparse Topical Coding, which is responsible for topic mining, with K-means, which discovers the latent clusters in the document collection. Experimental results on widely used datasets show that the proposed method significantly outperforms both traditional clustering methods and other topic-model-based clustering methods, improving clustering accuracy by 4% to 39% and normalized mutual information by 2% to more than 44%.
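
To make the ingredients named in the abstract concrete, the following is a minimal sketch of the simpler two-stage baseline it describes (topic-space features fed to K-means), followed by a normalized mutual information score. It is not the authors' unified method and does not implement Sparse Topical Coding. The topic-proportion vectors are hypothetical toy values, the function names `kmeans` and `nmi` are illustrative, and the normalization by the arithmetic mean of the two entropies is an assumption, since the abstract does not say which variant was used.

```python
import math
from collections import Counter


def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means. The first k points seed the centroids,
    which keeps this toy example deterministic (an illustrative choice)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels


def nmi(true_labels, pred_labels):
    """Normalized mutual information, normalized by the arithmetic mean
    of the two label entropies (assumed variant)."""
    n = len(true_labels)
    count_true = Counter(true_labels)
    count_pred = Counter(pred_labels)
    joint = Counter(zip(true_labels, pred_labels))
    mutual_info = sum(
        (nij / n) * math.log(nij * n / (count_true[t] * count_pred[p]))
        for (t, p), nij in joint.items()
    )
    h_true = -sum((c / n) * math.log(c / n) for c in count_true.values())
    h_pred = -sum((c / n) * math.log(c / n) for c in count_pred.values())
    denom = (h_true + h_pred) / 2
    return mutual_info / denom if denom > 0 else 1.0


# Hypothetical document-topic proportions (6 documents, 2 topics).
doc_topics = [[0.90, 0.10], [0.10, 0.90], [0.80, 0.20],
              [0.20, 0.80], [0.85, 0.15], [0.05, 0.95]]
truth = ["a", "b", "a", "b", "a", "b"]

clusters = kmeans(doc_topics, k=2)
print(clusters, round(nmi(truth, clusters), 3))
```

In this baseline the topic features are fixed before clustering starts; the abstract's point is that the proposed framework instead couples the two steps.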

Journal: Advances in Data Analysis and Classification
Publication Date: Feb 28, 2017
Citations: 8
