An improved K-nearest-neighbor algorithm for text categorization

Shengyi Jiang, Guansong Pang, Meiling Wu, Limin Kuang

doi:10.1016/j.eswa.2011.08.040


Open Access

https://doi.org/10.1016/j.eswa.2011.08.040


Abstract

Text categorization is an important tool for managing and organizing rapidly growing volumes of text data. Many text categorization algorithms have been explored in the literature, such as KNN, Naïve Bayes and Support Vector Machine. KNN is an effective but relatively inefficient classification method for text. In this paper, we propose an improved KNN algorithm for text categorization, which builds the classification model by combining a constrained one-pass clustering algorithm with KNN text categorization. Empirical results on three benchmark corpora show that our algorithm substantially reduces the number of text similarity computations and outperforms state-of-the-art KNN, Naïve Bayes and Support Vector Machine classifiers. In addition, the classification model constructed by the proposed algorithm can be updated incrementally, and it scales well in many real-world applications.
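The abstract does not give the algorithm's details, so the following is only a minimal Python sketch of the general idea it describes: compress the training set into clusters in a single pass, then run KNN against the cluster representatives instead of every training document. Several things here are assumptions rather than the authors' method: the "constraint" is taken to mean that a cluster holds documents of one label only, cosine similarity with a fixed `threshold` decides cluster membership, and the names `cosine`, `one_pass_cluster` and `classify` are hypothetical.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def one_pass_cluster(docs, labels, threshold, clusters=None):
    """Scan the documents once. Each document joins the most similar
    cluster with the same label (assumed constraint) if the similarity
    reaches `threshold`; otherwise it opens a new cluster.
    Passing an existing `clusters` list updates the model incrementally."""
    clusters = [] if clusters is None else clusters
    for vec, label in zip(docs, labels):
        best, best_sim = None, threshold
        for c in clusters:
            if c["label"] != label:
                continue  # assumed constraint: one label per cluster
            sim = cosine(vec, c["sum"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"sum": dict(vec), "label": label, "size": 1})
        else:
            # Keep the summed vector; cosine is scale-invariant, so the
            # sum behaves like the centroid for similarity purposes.
            for t, w in vec.items():
                best["sum"][t] = best["sum"].get(t, 0.0) + w
            best["size"] += 1
    return clusters


def classify(vec, clusters, k):
    """Similarity-weighted vote among the k most similar clusters."""
    scored = sorted(((cosine(vec, c["sum"]), c["label"]) for c in clusters),
                    key=lambda pair: pair[0], reverse=True)[:k]
    votes = defaultdict(float)
    for sim, label in scored:
        votes[label] += sim
    return max(votes, key=votes.get)


# Toy data, invented purely for illustration.
train = [
    {"ball": 1.0, "goal": 1.0},
    {"ball": 1.0, "team": 1.0},
    {"stock": 1.0, "market": 1.0},
    {"stock": 1.0, "bank": 1.0},
]
train_labels = ["sport", "sport", "finance", "finance"]
model = one_pass_cluster(train, train_labels, threshold=0.4)
print(len(model), classify({"ball": 1.0}, model, k=1))
```

In this sketch a query is compared with one vector per cluster rather than one per training document, which is where the reduction in similarity computations would come from. New labeled documents can be folded in by calling `one_pass_cluster` again with the existing `clusters` list, without rebuilding the model.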


Journal: Expert Systems with Applications
Publication Date: Aug 7, 2011
Citations: 256
License type: cc-by-nc-nd


Similar Papers

Multi-label Arabic text categorization: A benchmark and baseline comparison of multi-label learning algorithms
Bassam Al-Salemi ... Shahrul Azman Mohd Noah
Information Processing & Management | VOL. 56
22 Oct 2018

Experiments on Supervised Learning Algorithms for Text Categorization
S.M Namburu ... Jianhui Luo
01 Jan 2004

An algorithm for text categorization with SVM
Hu Jun ... Huang Houkuan
28 Oct 2002

Polynomial Networks Versus Other Techniques in Text Categorization
Mayy M Al-Tahrawi ... Raed Abu Zitar
International Journal of Pattern Recognition and Artificial Intelligence | VOL. 22
01 Mar 2008
