Text categorization: past and present

Ankita Dhar, Niladri Sekhar Dash, Himadri Mukherjee, Kaushik Roy

doi:10.1007/s10462-020-09919-1

Abstract

Automatic text categorization is the task of sorting text documents into pre-defined categories using machine learning algorithms. It is one of the most important approaches to organizing and making use of the large volume of information that exists in unstructured form. Text categorization has become an extensively researched field within text mining and language processing. Word sense and the semantic relationships among terms, text documents, and categories are essential for enhancing categorization performance. Several surveys on text categorization are already available; they cover techniques based on various text representation schemes to some extent, but they omit several approaches that have been explored beyond the standard techniques. Here, an exhaustive analysis of different text categorization approaches, extending beyond the conventional ones, is undertaken. This survey explores a wide variety of algorithms used for categorizing text documents and groups the existing works into three basic fields: conventional methods, fuzzy logic-based methods, and deep learning-based methods. Conventional methods are further divided into three fields: text categorization using handcrafted features, text categorization using nature-inspired algorithms, and text categorization using graph-based methods. Finally, the survey gives an overview of the libraries available for different algorithms, the availability of datasets, and the categorization technologies explored in various Indian and non-Indian languages.
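
As a minimal illustration of the definition above, the following sketch sorts short documents into pre-defined categories with a multinomial naive Bayes classifier over bag-of-words counts, which is one simple instance of the handcrafted-feature family. It is not a method taken from the survey: the toy training documents, the category names, and the helper names `tokenize`, `train`, and `classify` are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

# Toy labelled corpus (hypothetical data, for illustration only).
TRAINING_DOCS = [
    ("the team won the match", "sports"),
    ("the striker scored a goal", "sports"),
    ("the market fell sharply", "finance"),
    ("the bank raised interest rates", "finance"),
]


def tokenize(text):
    """Lowercase and split on whitespace: a deliberately crude feature extractor."""
    return text.lower().split()


def train(docs):
    """Count documents per category and word occurrences per category."""
    category_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, category in docs:
        category_counts[category] += 1
        for word in tokenize(text):
            word_counts[category][word] += 1
            vocabulary.add(word)
    return category_counts, word_counts, vocabulary


def classify(model, text):
    """Return the category with the highest log-posterior under naive Bayes."""
    category_counts, word_counts, vocabulary = model
    total_docs = sum(category_counts.values())
    best_category, best_score = None, -math.inf
    for category, doc_count in category_counts.items():
        # Log prior: fraction of training documents in this category.
        score = math.log(doc_count / total_docs)
        total_words = sum(word_counts[category].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen during training
            # Laplace (add-one) smoothing avoids zero probabilities.
            likelihood = (word_counts[category][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_category, best_score = category, score
    return best_category


model = train(TRAINING_DOCS)
print(classify(model, "the striker scored"))   # sports
print(classify(model, "interest rates fell"))  # finance
```

Whitespace tokenization and add-one smoothing are chosen here only to keep the sketch self-contained; practical systems typically use richer text representations and larger corpora.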

Journal: Artificial Intelligence Review
Publication Date: Sep 30, 2020
Citations: 50

Similar Papers

Some Investigations on Machine Learning Techniques for Automated Text Categorization
Bhagirath Prajapati ... Sanjay Garg
International Journal of Computer Applications | VOL. 71
26 Jun 2013

Classifying text documents by associating terms with text categories
01 Jan 2002

Text Categorization Techniques and Current Trends
Abhisu Jain ... Vikrant Singh
International Journal of Engineering and Advanced Technology | VOL. 9
30 Jun 2020

A two-stage feature selection method for text categorization by using information gain, principal component analysis and genetic algorithm
Harun Uğuz
Knowledge-Based Systems | VOL. 24
29 Apr 2011
