Use of Ontology to Support Concept-Based Text Categorization

Yen-Hsien Lee, Tsai-Hsin Chu, Wan-Jung Tsao

doi:10.1007/978-3-642-01256-3_17

Abstract

The huge volume of information now accessible worldwide has created a need for tools that handle it better than conventional manual methods can. Automated text categorization techniques therefore serve to support more effective document organization and management. Conventional text categorization techniques concentrate on analyzing document contents and measure similarity by the overlap between the features of unlabeled documents and those of pre-classified documents. However, such feature-based approaches are confronted with the problems of word mismatch and word ambiguity. To lessen these problems, this study proposes an ontology-based text categorization technique. It employs a domain-specific ontology so that documents can be classified according to the range of concepts relevant to them. The effectiveness of the proposed technique is measured and compared with that of benchmark techniques. The evaluation results suggest that the proposed technique is more effective than the benchmarks.

Keywords: Document-category management, Concept-based text categorization, Ontology, k-nearest neighbors

Full Text