The Chinese Text Categorization System with Category Priorities

Huan-Chao Keh, Hui-Hua Huang, Ding-An Chiang, Chih-Cheng Hsu

doi:10.4304/jsw.5.10.1137-1143

Journal: Journal of Software
Publication Date: Jan 10, 2010
Citations: 1

Affiliation: Tamkang University

Keywords: Text Categorization, Category Priority

Abstract

Text categorization requires some understanding of the content of the documents, some prior knowledge of the categories, or both. To capture document content, our Chinese text categorization system uses a filtering measure for feature selection, and we modify the Term Frequency-Inverse Document Frequency (TF-IDF) formula to strengthen the weights of important keywords and weaken the weights of unimportant ones. To capture category knowledge, we use category priority to represent the relationship between two different categories. The experimental results show that our method effectively reduces noisy text and also increases both the accuracy rate and the recall rate of text categorization.
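
For orientation, the standard TF-IDF weighting that the abstract takes as its starting point can be sketched as follows. This is only the textbook baseline: the abstract does not state the authors' modified formula, so no attempt is made to reproduce it here. The function name `tf_idf`, the relative-frequency form of TF, the natural-log IDF, and the assumption that Chinese text has already been segmented into tokens are all illustrative choices, not details from the paper.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute textbook TF-IDF weights.

    docs: list of documents, each a list of already-segmented tokens
          (word segmentation of the Chinese text is assumed to happen upstream).
    Returns one {term: weight} dict per document.
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each term occurs at least once.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        # TF is the relative frequency within the document;
        # IDF is log(N / df), so a term found in every document gets weight 0.
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights
```

Under this baseline, a term that appears in fewer documents receives a higher weight than an equally frequent term that appears in many, which is the behavior a modified formula would presumably sharpen further for important keywords.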

Full Text