A Novel Fuzzy based Clustering Algorithm for Text Classification

A Krishnamohan, V V Narasimha Rao, M H M Krishna Prasad

doi:10.5120/7211-9998

Abstract

With the growth of the World Wide Web and the rapid development of Internet technology, the increasing volume of digital textual data has become more and more unmanageable, and text classification has therefore gained significant attention. Text classification poses some specific challenges, such as high dimensionality, with each document (data point) containing only a very small subset of the features and possibly carrying multiple labels at the same time. Feature clustering is a powerful method for reducing the dimensionality of feature vectors for text classification, and many researchers have worked on feature clustering for efficient text classification. Recently, a fuzzy-based feature clustering method was proposed in which a Gaussian distribution is used as the fuzzy membership function for clustering. However, the problem of skewness may occur with this distribution. To overcome it, we propose a fuzzy-similarity-based membership function for efficient clustering, and with the proposed algorithm we obtained satisfactory results.
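To make the skewness concern concrete, the following minimal Python sketch shows a Gaussian fuzzy membership function of the kind commonly used in fuzzy feature clustering. It is an illustration under assumptions, not the method of the cited prior work or of this paper: the function name `gaussian_membership`, the per-dimension product form, and the sample values are all hypothetical. The proposed fuzzy similarity function is not specified in the abstract and is not reproduced here.

```python
import math


def gaussian_membership(x, mean, sigma):
    """Membership of pattern x in a cluster described by a per-dimension
    mean and deviation (assumed form: product of 1-D Gaussian terms)."""
    return math.prod(
        math.exp(-((xi - mi) / si) ** 2)
        for xi, mi, si in zip(x, mean, sigma)
    )


# Hypothetical cluster centre and spread in a 2-dimensional feature space.
mean = [0.2, 0.8]
sigma = [0.1, 0.1]

# A pattern at the centre has full membership; membership decays with distance.
print(gaussian_membership([0.2, 0.8], mean, sigma))
print(gaussian_membership([0.3, 0.8], mean, sigma))

# The function is symmetric about the mean: points equally far on either
# side receive the same membership, whatever the shape of the data.
print(gaussian_membership([0.1, 0.8], mean, sigma))
```

Because this form is symmetric about the cluster mean, it cannot follow a cluster whose patterns are spread unevenly on either side of the centre, which is one way to read the skewness problem mentioned above.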

Full Text