Application of Linear Classifier on Chinese Spam Filtering

Yongqin Qiu, Yan Xu, Dan Li

doi:10.4304/jsw.6.1.116-123

Abstract

Spam is a key problem in electronic communication, especially in large-scale email systems. Content-based filtering is one mainstream method of combating this threat in its various forms: an email filtering system can learn directly from a user’s mail set. However, previous content-based filtering methods have struggled to balance efficiency and effectiveness. Text categorization algorithms such as Naive Bayes, kNN, Decision Tree and Boosting can be applied to spam filtering, but the effectiveness of Naive Bayes is limited and it is not well suited to instant feedback learning. Other algorithms, such as SVM, are more effective but computationally expensive. Because a real email system often has to handle a large volume of emails in a short time, efficiency is often as important as effectiveness when implementing an anti-spam filtering method. We therefore look for a linear classifier to solve this problem and explore two fast online linear classifiers for the task: the Perceptron and Winnow. Training for both methods is online and mistake-driven, and both are suitable for user feedback. We evaluate the two methods on three benchmark corpora, PU1, Ling-Spam and 2005-Jun, and the experiments on these public email corpora show effective results. We conclude that the two online linear classifiers achieve state-of-the-art performance for spam filtering, especially for Chinese spam emails.
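The online, mistake-driven training that the abstract describes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the binary token features, the toy email stream, the class and function names, and the parameter values (learning rate, threshold, promotion factor) are all assumptions chosen for illustration. For Chinese mail, a word segmentation step is assumed to have produced the tokens beforehand. The Perceptron adjusts weights additively and Winnow multiplicatively, and both change weights only when a prediction is wrong.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical example type: (tokens in the email, label), +1 = spam, -1 = legitimate.
Example = Tuple[Set[str], int]


class Perceptron:
    """Linear classifier with additive, mistake-driven updates."""

    def __init__(self, learning_rate: float = 1.0) -> None:
        self.weights: Dict[str, float] = {}
        self.bias = 0.0
        self.learning_rate = learning_rate  # assumed value, not from the paper

    def score(self, tokens: Set[str]) -> float:
        return self.bias + sum(self.weights.get(t, 0.0) for t in tokens)

    def predict(self, tokens: Set[str]) -> int:
        return 1 if self.score(tokens) > 0 else -1

    def update(self, tokens: Set[str], label: int) -> bool:
        """Process one email; return True if a mistake triggered an update."""
        if self.predict(tokens) == label:
            return False  # correct prediction: weights stay unchanged
        for t in tokens:
            self.weights[t] = self.weights.get(t, 0.0) + self.learning_rate * label
        self.bias += self.learning_rate * label
        return True


class Winnow:
    """Linear classifier with multiplicative, mistake-driven updates."""

    def __init__(self, threshold: float = 2.0, alpha: float = 2.0) -> None:
        self.weights: Dict[str, float] = {}  # unseen tokens default to weight 1.0
        self.threshold = threshold  # assumed value, not from the paper
        self.alpha = alpha          # promotion/demotion factor, also assumed

    def score(self, tokens: Set[str]) -> float:
        return sum(self.weights.get(t, 1.0) for t in tokens)

    def predict(self, tokens: Set[str]) -> int:
        return 1 if self.score(tokens) > self.threshold else -1

    def update(self, tokens: Set[str], label: int) -> bool:
        """Process one email; return True if a mistake triggered an update."""
        if self.predict(tokens) == label:
            return False
        # Promote active features on a missed spam, demote them on a false alarm.
        factor = self.alpha if label == 1 else 1.0 / self.alpha
        for t in tokens:
            self.weights[t] = self.weights.get(t, 1.0) * factor
        return True


def train(model, stream: List[Example], epochs: int = 5) -> int:
    """Feed the stream to the model one email at a time; return total mistakes."""
    mistakes = 0
    for _ in range(epochs):
        for tokens, label in stream:
            mistakes += model.update(tokens, label)
    return mistakes


# Toy stream invented for this sketch; it is not drawn from PU1, Ling-Spam or 2005-Jun.
TOY_STREAM: List[Example] = [
    ({"free", "prize", "click"}, 1),
    ({"meeting", "agenda", "monday"}, -1),
    ({"free", "offer", "click"}, 1),
    ({"project", "agenda", "review"}, -1),
]

if __name__ == "__main__":
    for model in (Perceptron(), Winnow()):
        total = train(model, TOY_STREAM)
        print(type(model).__name__, "mistakes during training:", total)
```

Because each `update` call touches only the tokens present in one email, a user's correction of a misclassified message can be passed to the same method, which is one way to read the abstract's claim that these methods suit feedback learning.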


Published in: Journal of Software

