Towards enhancing centroid classifier for text classification—A border-instance approach

Deqing Wang, Junjie Wu, Hui Zhang, Ke Xu, Mengxiang Lin

doi:10.1016/j.neucom.2012.08.019

Abstract

Text classification/categorization (TC) is the task of assigning new, unlabeled natural-language documents to predefined thematic categories. The centroid-based classifier (CC) has been widely used for TC because of its simplicity and efficiency. However, it has long been criticized for its relatively low classification accuracy compared with state-of-the-art classifiers such as support vector machines (SVMs). In this paper, we find that a CC whose centroid vectors are constructed from border instances only, rather than from all instances, can achieve higher generalization accuracy. Along this line, we propose the Border-Instance-based Iteratively Adjusted Centroid Classifier (IACC_BI), which constructs its centroid vectors from border instances found by a selection routine such as the 1-Nearest-and-1-Furthest-Neighbors strategy. IACC_BI then iteratively adjusts the initial centroid vectors according to the misclassified training instances. Our extensive experiments on 11 real-world text corpora demonstrate that IACC_BI greatly improves the performance of centroid-based classifiers and obtains classification accuracy competitive with that of the well-known SVMs, at significantly lower computational cost.
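The abstract names three steps: selecting border instances, building centroids from them, and adjusting the centroids using misclassified training instances. It does not specify any of them in detail, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. The assumptions are: cosine similarity as the similarity measure; a reading of the 1-Nearest-and-1-Furthest-Neighbors rule in which each instance nominates its most similar instance from another class and its least similar instance from its own class; and a perceptron-style update with a hypothetical step size `lr`. All function and parameter names (`select_border`, `build_centroids`, `train_iacc_bi_sketch`, `lr`, `max_iter`) are illustrative.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity of two dense vectors (0.0 if either is all zeros)."""
    dot = sum(p * q for p, q in zip(a, b))
    na = math.sqrt(sum(p * p for p in a))
    nb = math.sqrt(sum(q * q for q in b))
    return dot / (na * nb) if na and nb else 0.0


def select_border(X, y):
    """ASSUMED border rule: each instance nominates its most similar instance
    from another class and its least similar instance from its own class."""
    border = set()
    for i, xi in enumerate(X):
        other = [j for j in range(len(X)) if y[j] != y[i]]
        same = [j for j in range(len(X)) if y[j] == y[i] and j != i]
        if other:
            border.add(max(other, key=lambda j: cosine(xi, X[j])))
        if same:
            border.add(min(same, key=lambda j: cosine(xi, X[j])))
    return sorted(border)


def build_centroids(X, y, indices):
    """Sum the selected vectors per class; cosine similarity ignores scale,
    so the sum works as well as the mean here."""
    dim = len(X[0])
    centroids = defaultdict(lambda: [0.0] * dim)
    for i in indices:
        c = centroids[y[i]]
        for k, v in enumerate(X[i]):
            c[k] += v
    return dict(centroids)


def predict(centroids, x):
    """Assign x to the class whose centroid is most similar to it."""
    return max(centroids, key=lambda label: cosine(centroids[label], x))


def train_iacc_bi_sketch(X, y, max_iter=10, lr=0.5):
    """Border-instance centroids followed by an ASSUMED perceptron-style
    adjustment driven by misclassified training instances."""
    centroids = build_centroids(X, y, select_border(X, y))
    # Fallback (assumption): a class with no border instance uses all its instances.
    for label in set(y) - set(centroids):
        members = [i for i, l in enumerate(y) if l == label]
        centroids.update(build_centroids(X, y, members))
    for _ in range(max_iter):
        errors = 0
        for xi, yi in zip(X, y):
            guess = predict(centroids, xi)
            if guess != yi:
                errors += 1
                for k, v in enumerate(xi):
                    centroids[yi][k] += lr * v     # pull the true centroid closer
                    centroids[guess][k] -= lr * v  # push the wrong centroid away
        if errors == 0:
            break
    return centroids


# Toy two-class data standing in for document vectors (e.g. TF-IDF weights).
X = [[1.0, 0.1], [0.9, 0.3], [0.8, 0.6], [0.1, 1.0], [0.3, 0.9], [0.6, 0.8]]
y = ["a", "a", "a", "b", "b", "b"]
model = train_iacc_bi_sketch(X, y)
print(select_border(X, y), [predict(model, x) for x in X])
```

In this sketch the border set is much smaller than the training set, which is the property the abstract relies on: the centroids are shaped by instances near the class boundary rather than by the bulk of each class.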

Journal: Neurocomputing	Publication Date: Sep 28, 2012
Citations: 18

Similar Papers

Supervised term weighting centroid-based classifiers for text categorization
Tam T. Nguyen ... Siu Cheung Hui
Knowledge and Information Systems | VOL. 35
09 Sep 2012

A new Centroid-Based Classification model for text categorization
Chuan Liu ... Fengmao Lv
Knowledge-Based Systems | VOL. 136
30 Aug 2017

A Methodology Combining Cosine Similarity with Classifier for Text Classification
Kwangil Park ... Wooju Kim
Applied Artificial Intelligence | VOL. 34
08 Feb 2020

CLASSIFICATION OF MEDICAL IMAGES USING MACHINE LEARNING
Eduardo Perez Careta ... Miguel Torres Cisneros
DYNA | VOL. 97
01 Jan 2021
