An Elaboration of Text Categorization and Automatic Text Classification Through Mathematical and Graphical Modelling

Ahmed Faraz

doi:10.5121/cseij.2015.5301

Abstract

As time goes on, the volume of digitized text has grown enormously, and the need to organize, categorize and classify text has become indispensable. Poorly organized text with little categorization and classification can slow the response time of text and information retrieval. It is therefore essential to organize, categorize and classify texts and digitized documents according to the definitions proposed by text mining experts and computer scientists. Computer and information scientists have already worked on Text Mining, Text Categorization and Automatic Text Classification, but considerable room for novel research remains in this domain. In this paper we propose mathematical notation and graphical models for Text Mining, Text Categorization and Automatic Text Classification to give an in-depth understanding of these techniques and concepts. Introducing these mathematical and graphical models will shorten the response time of text and information retrieval. The performance of web search engines can also be improved considerably by employing these models.

Highlights

In the last fifteen years, content-based document management systems have gained outstanding status in the fields of Computer and Information Systems Engineering and Computer Science.
In real-world applications of Text Categorization (TC) from the early 1960s to the late 1980s, much work was done on Knowledge Engineering, which is one approach to Text Categorization (TC). In Knowledge Engineering, to classify documents under given categories, the experts' knowledge was manually encoded as a rule or a set of rules.
Automatic Text Classification (ATC): Readers should be clear that Automatic Text Classification (ATC) and Text Categorization (TC) are different. We propose new definitions of Automatic Text Classification (ATC) here, which differ from the definitions in the literature.

Summary

INTRODUCTION

In the last fifteen years, content-based document management systems have gained outstanding status in the fields of Computer and Information Systems Engineering and Computer Science. There are two reasons for the popularity of content-based management systems. Consider a room in which many things and accessories are scattered in different directions. Anyone who wants to find an item in this room has to make a great deal of effort, because the items are disorganized and because people tend to be confused by seeing many things gathered together. If all the things are organized and placed in their appropriate locations, the search will be easy and fast. In the same way, if text is categorized and documents are classified among categories, search and retrieval of text will be fast and efficient.
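The room analogy can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the document identifiers and the sample texts are all hypothetical, and it assumes each document has already been assigned exactly one category. It shows only that a search restricted to one category inspects fewer documents than a search over the whole collection.

```python
from collections import defaultdict


def build_category_index(labelled_docs):
    """Group document ids by category (hypothetical helper, not from the paper)."""
    index = defaultdict(list)
    for doc_id, category in labelled_docs:
        index[category].append(doc_id)
    return index


def search_in_category(index, documents, category, term):
    """Scan only the documents filed under `category` for `term`."""
    term = term.lower()
    return [d for d in index.get(category, []) if term in documents[d].lower()]


# Illustrative data: document id -> text, plus one category label per document.
documents = {
    "d1": "The team won the final match",
    "d2": "The central bank changed its interest rate",
    "d3": "A late goal decided the match",
}
labels = [("d1", "sports"), ("d2", "finance"), ("d3", "sports")]

index = build_category_index(labels)
hits = search_in_category(index, documents, "sports", "match")
```

Here the query touches only the two documents filed under "sports" and never reads the "finance" document, which is the organized room in miniature.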

TEXT CATEGORIZATION

Knowledge Engineering Approach
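
As the highlights note, in Knowledge Engineering the experts' knowledge is manually encoded as a set of rules for classifying documents under given categories. The following is a minimal sketch of that idea, not the paper's own formulation: the rule table, the keywords and the function name are hypothetical, and a "rule" is assumed here to be nothing more than a set of keywords attached to a category.

```python
import re

# Hypothetical hand-written rules: each category maps to keywords that an
# expert associates with it. These are illustrative, not from the paper.
RULES = {
    "sports": {"match", "goal", "tournament"},
    "finance": {"stock", "bank", "interest"},
}


def classify(document, rules=RULES):
    """Return the sorted categories whose rule fires for the document.

    A rule fires when at least one of its keywords occurs as a word in the
    document. A document may fall under several categories or under none.
    """
    words = set(re.findall(r"[a-z]+", document.lower()))
    return sorted(cat for cat, keywords in rules.items() if words & keywords)
```

Every new category or change in vocabulary requires an expert to edit the rule table by hand, which is the manual effort this approach depends on.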

Machine Learning Paradigm Approach

Advantages of Machine Learning Paradigm Approach

TEXT MINING

Mathematical Notation of Automatic Text Classification

FUTURE RESEARCH WORK

CONCLUSIONS


Journal: Computer Science & Engineering: An International Journal	Publication Date: Jun 30, 2015
Citations: 4	License type: cc-by
