Utility-based feature selection for text classification

Heyong Wang, Ming Hong, Raymond Yiu Keung Lau

doi:10.1007/s10115-018-1281-z

Abstract

Feature selection is an important step before classification, used to reduce computational cost and improve classification performance. This paper presents a novel feature selection method based on the concept of utility, which is grounded in economic theory. In particular, we focus on a utility-based feature selection method for enhancing text classification. Unlike existing feature selection methods, the proposed method selects discriminative semantic terms according to how authors use terms to express the main ideas of textual documents, i.e., the “utility of terms,” a criterion that measures how useful terms are for expressing authors’ main ideas. To the best of our knowledge, our work represents successful research on leveraging economic theory to develop a semantically rich feature selection method that improves text classification. Our empirical tests on six UCI benchmark datasets confirm that the proposed method often outperforms other state-of-the-art feature selection methods in text classification. Moreover, our method provides an economics-based explanation of term weighting for information retrieval and semantic information acquisition in textual documents.

Journal: Knowledge and Information Systems
Publication Date: Dec 8, 2018
Citations: 4

Similar Papers

Migrating birds optimization-based feature selection for text classification
Cem Kaya ... Zeynep Hilal Kilimci
PeerJ Computer Science | VOL. 10
30 Aug 2024

Research on Feature Selection and kNN Classification Method in Chinese Text Classification
Chao Xiao ... Ping Wu
01 Jan 2015

Information gain and divergence-based feature selection for machine learning-based text categorization
Changki Lee ... Gary Geunbae Lee
Information Processing & Management | VOL. 42
03 Aug 2005

A novel redistribution-based feature selection for text classification
Murat Okkalioglu
Expert Systems With Applications | VOL. 246
17 Jan 2024
