Feature selection for text classification: A review

Xuelian Deng,Yuqing Li,Jian Weng,Jilian Zhang

doi:10.1007/s11042-018-6083-5

Abstract

Big multimedia data is inherently heterogeneous: it may mix video, audio, text, and images. This heterogeneity stems from the recent prevalence of novel applications such as social media, video sharing, and location-based services (LBS). In many multimedia applications, for example video/image tagging and multimedia recommendation, text classification techniques are used extensively to facilitate multimedia data processing. In this paper, we give a comprehensive review of feature selection techniques for text classification. We begin by introducing some popular document representation schemes and the similarity measures used in text classification. Then, we review the most popular text classifiers, including the Nearest Neighbor (NN) method, Naive Bayes (NB), Support Vector Machine (SVM), Decision Tree (DT), and Neural Networks. Next, we survey four feature selection models, namely filter, wrapper, embedded, and hybrid, and discuss the pros and cons of state-of-the-art feature selection approaches. Finally, we conclude the paper and briefly introduce some interesting feature selection work that does not belong to any of the four models.

Journal: Multimedia Tools and Applications
Publication Date: May 8, 2018
Citations: 247

Similar Papers

Feature selection for text classification using genetic algorithms
Noria Bidi ... Zakaria Elberrichi
01 Nov 2016

Feature selection for text classification with Naïve Bayes
Jingnian Chen ... Youli Qu
Expert Systems with Applications | VOL. 36
24 Jun 2008

Information gain and divergence-based feature selection for machine learning-based text categorization
Changki Lee ... Gary Geunbae Lee
Information Processing & Management | VOL. 42
03 Aug 2005

Modified Pointwise Mutual Information-Based Feature Selection for Text Classification
Tsvetanka Georgieva-Trifonova
04 Nov 2021
