A Semantic Based Approach for Topic Evaluation in Information Filtering

Yue Xu, Yuefeng Li, Hanh Nguyen

doi:10.1109/access.2020.2985079

Open Access

https://doi.org/10.1109/access.2020.2985079

Journal: IEEE Access
Publication Date: Jan 1, 2020
Citations: 10
License type: CC BY 4.0

Affiliation: Queensland University of Technology

Abstract

Topic modelling has been successfully applied in many text mining applications, such as natural language processing, information retrieval, and information filtering. In information filtering systems (IFs), user interest representation is the core component that determines the success of the system. Topics in a topic model generated from a user's documents can be used to represent the user's information interest. However, the quality of a topic model generated from a document collection is not always high, because its topics might contain meaningless or ambiguous words. This ambiguity problem can degrade the performance of IFs that use a topic model to represent user information interest. Hence, a topic evaluation method that assesses the quality of topics in a topic model is important for ensuring that the topic model is used effectively in text mining applications. One method of measuring the quality of a topic model is to match the topical words of the model to concepts in an ontology. However, a limitation of this method is that some topical words in an examined topic cannot be found in the ontology. In this study, we propose a new model that evaluates the quality of topics by matching their words to concepts in an ontology. In particular, a word embedding technique is applied to deal with the ambiguity problem by finding similar concept words based on word embeddings. The assessed topics are then used in an information filtering system to filter relevant documents for a user. The proposed model was evaluated against state-of-the-art baseline models covering term-based, phrase-based, and topic-based user interest representations, as well as several topic evaluation models. The evaluation results show that the proposed model outperforms the state-of-the-art baseline models.
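The matching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the ontology, the three-dimensional vectors, the function names `match_concept` and `matched_fraction`, the 0.8 threshold, and the fraction-based score are all assumptions made up for this example. A topical word that appears in the ontology matches directly; otherwise the sketch falls back to the most similar concept word by cosine similarity of the embeddings.

```python
import math

# Hypothetical toy ontology: a set of concept labels (assumption for illustration).
ONTOLOGY_CONCEPTS = {"computer", "network", "software"}

# Hypothetical toy embeddings: made-up 3-d vectors, not from a trained model.
EMBEDDINGS = {
    "computer": [0.9, 0.1, 0.0],
    "network": [0.1, 0.9, 0.1],
    "software": [0.7, 0.2, 0.4],
    "laptop": [0.85, 0.15, 0.05],
    "banana": [0.0, 0.1, 0.95],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_concept(word, threshold=0.8):
    """Return (concept, similarity) for a topical word, or None if no match.

    A word found in the ontology matches itself with similarity 1.0.
    Otherwise the closest concept by embedding similarity is returned,
    provided it reaches the (assumed) threshold.
    """
    if word in ONTOLOGY_CONCEPTS:
        return word, 1.0
    vec = EMBEDDINGS.get(word)
    if vec is None:
        return None  # no embedding available, so no fallback is possible
    candidates = [
        (concept, cosine(vec, EMBEDDINGS[concept]))
        for concept in ONTOLOGY_CONCEPTS
        if concept in EMBEDDINGS
    ]
    if not candidates:
        return None
    best = max(candidates, key=lambda pair: pair[1])
    return best if best[1] >= threshold else None


def matched_fraction(topic_words, threshold=0.8):
    """Fraction of topical words that map to some concept (an assumed score)."""
    if not topic_words:
        return 0.0
    hits = sum(1 for w in topic_words if match_concept(w, threshold) is not None)
    return hits / len(topic_words)
```

In this toy setup, "laptop" is absent from the ontology but lies close to "computer" in the embedding space, so it is still matched, while "banana" has no sufficiently similar concept and is left unmatched. A real system would replace the toy dictionaries with an actual ontology and trained embeddings.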

Highlights

The past decade has seen the rapid development of topic modelling for understanding text corpora.
This paper proposes a model, named Semantic based Topic Evaluation (SbTE), to evaluate the quality of topics generated from a document collection based on the semantics of the documents.
We evaluated the performance of topic evaluation by applying the assessed topics to document ranking in information filtering systems.

Summary

Introduction

The past decade has seen the rapid development of topic modelling for understanding text corpora. Among the state-of-the-art models, Latent Dirichlet Allocation (LDA) [1]–[3] is the most popular technique, and it provides an explicit representation of documents. In LDA, each document is represented by a probability distribution over topics, and each topic is a probability distribution over words. The topic model based document representation has been successfully applied to many text mining applications. However, the topics generated by LDA still have limitations. Ambiguous or meaningless topical words and topics were reported in [4] as a common limitation of topic models in general, and many topical words are ambiguous and noisy [4].
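The LDA representation described above can be made concrete with a small sketch. All numbers and names here (`topic_word`, `doc_topic`, `word_probability`, `top_words`) are invented for illustration and do not come from the paper. The sketch only shows the standard mixture reading of such a model, where the probability of a word in a document is the topic-weighted sum of the per-topic word probabilities; it does not train a model.

```python
# Toy distributions, invented for illustration only.
# Each topic is a probability distribution over words.
topic_word = {
    "topic_0": {"data": 0.5, "model": 0.3, "game": 0.2},
    "topic_1": {"data": 0.1, "model": 0.1, "game": 0.8},
}

# A document is a probability distribution over topics.
doc_topic = {"topic_0": 0.75, "topic_1": 0.25}


def word_probability(word, doc_topic, topic_word):
    """p(word | doc) = sum over topics of p(topic | doc) * p(word | topic)."""
    return sum(
        p_topic * topic_word[topic].get(word, 0.0)
        for topic, p_topic in doc_topic.items()
    )


def top_words(word_dist, n=2):
    """The n highest-probability words of a topic (its topical words)."""
    ranked = sorted(word_dist.items(), key=lambda kv: kv[1], reverse=True)
    return [word for word, _ in ranked[:n]]
```

Calling `word_probability("data", doc_topic, topic_word)` mixes the two topics according to the document's topic weights, and `top_words(topic_word["topic_0"])` lists the words that would be shown as that topic's topical words, which is where ambiguous or noisy words become visible.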

Similar Papers

Comparative Study of Topic Modeling and Word Embedding Approaches for Web Service Clustering
Neha Agarwal ... Geeta Sikka
05 Aug 2021

A Semantic Similarity Based Topic Evaluation for Enhancing Information Filtering
Hanh Nguyen ... Yuefeng Li
01 Dec 2018

Word embedding empowered topic recognition in news articles
Sidrah Kaleem ... Moutaz Alazab
PeerJ Computer Science | VOL. 10
11 Dec 2024

Multi-scaled Topic Embedding for Text Classification
Jiaheng Zhang ... Pengfei Li
24 Jun 2022
