Semantic text classification: A survey of past and recent advances

Berna Altınel, Murat Can Ganiz

doi:10.1016/j.ipm.2018.08.001

Abstract

Automatic text classification is the task of organizing documents into predetermined classes, generally using machine learning algorithms. It is one of the most important methods for organizing and making use of the vast amounts of information that exist in unstructured textual form. Text classification is a widely studied research area of language processing and text mining. In traditional text classification, a document is represented as a bag of words in which the words (in other words, terms) are cut off from their finer context, i.e., their location in a sentence or in a document. Only the broader context of the document is used, together with some type of term frequency information in the vector space. Consequently, the semantics of words that can be inferred from the finer context of their location in a sentence and their relations with neighboring words are usually ignored. However, the meanings of words and the semantic connections between words, documents, and even classes are important, since methods that capture semantics generally achieve better classification performance. Several surveys have been published that analyze diverse approaches to traditional text classification. Most of these surveys cover the application of different semantic term relatedness methods in text classification to some degree. However, they do not specifically target semantic text classification algorithms and their advantages over traditional text classification. To fill this gap, we undertake a comprehensive discussion of semantic versus traditional text classification. This survey explores past and recent advances in semantic text classification and attempts to organize existing approaches under five fundamental categories: domain knowledge-based approaches, corpus-based approaches, deep learning-based approaches, word/character sequence enhanced approaches, and linguistic enriched approaches. Furthermore, this survey highlights the advantages of semantic text classification algorithms over traditional text classification algorithms.
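
As a minimal illustration of the bag-of-words limitation described in the abstract (not code from the survey itself), the Python sketch below builds term-frequency vectors for two sentences that use the same words in a different order. The function name `bag_of_words`, the whitespace tokenization, and the example sentences are assumptions chosen for illustration only.

```python
from collections import Counter


def bag_of_words(text, vocabulary):
    """Return a term-frequency vector for `text` over a fixed vocabulary.

    Hypothetical helper: tokenization is a plain lowercase whitespace split.
    """
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


# Two documents with opposite meanings but identical word counts.
docs = ["the dog bites the man", "the man bites the dog"]

# Shared vocabulary: every distinct term across the documents, sorted.
vocabulary = sorted({term for doc in docs for term in doc.lower().split()})

# One term-frequency vector per document in the shared vector space.
vectors = [bag_of_words(doc, vocabulary) for doc in docs]

# Word order is discarded, so both documents map to the same vector.
print(vocabulary)
print(vectors[0] == vectors[1])
```

Because both sentences receive the same vector, a classifier operating on this representation alone cannot tell them apart, which is the kind of lost fine-grained context that semantic approaches aim to recover.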

Journal: Information Processing & Management
Publication Date: Aug 20, 2018
Citations: 145

Similar Papers

A Novel Class-Center Vector Model for Text Classification Using Dependencies and a Semantic Dictionary
Xinhua Zhu ... Qingting Xu
IEEE Access | VOL. 8
05 Dec 2019

Digital Library Information Integration System Based on Big Data and Deep Learning
Xiao Lin ... Ying Zhang
Journal of Sensors | VOL. 2022
01 Jul 2022

An Improved Class-Center Method for Text Classification Using Dependencies and WordNet
Xinhua Zhu ... Qingting Xu
01 Jan 2019

Co-STM text categorization method based on Supervised Topic Model
Genpeng Zhang ... Xiaoyan Liu
01 Mar 2021
