An effective term weighting method using random walk model for text classification

Md Rafiqul Islam, Md Rakibul Islam

doi:10.1109/iccitechn.2008.4803000

Abstract

Text classification may be viewed as assigning texts to a predefined set of categories. However, many digital documents are not organized according to their contents, so it is difficult for a user to find relevant documents. Automatic text classification can address this problem. In this paper we introduce a new random walk term weighting method for improved text classification. To weight a term, our approach exploits the relationship between local information (term position, term frequency) and global information (inverse document frequency, information gain) about terms (the vertices of the graph). Moreover, we weight terms by considering the co-occurrence and semantic relations of terms as a measure of dependency. To evaluate our term weighting approach, we integrate it into the Rocchio text classification algorithm; experimental results show that our method performs better than other random walk models.
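
To make the idea of random walk term weighting concrete, the following is a minimal sketch of a generic PageRank-style walk over a term co-occurrence graph. It is not the formulation proposed in this paper: it uses only co-occurrence within a sliding window and leaves out term position, term frequency, inverse document frequency, information gain, and semantic relations. The function name `random_walk_term_weights` and the parameters `window`, `damping`, and `iterations` are assumptions chosen for illustration.

```python
from collections import defaultdict


def random_walk_term_weights(tokens, window=2, damping=0.85, iterations=50):
    """Score terms with a PageRank-style walk on an undirected co-occurrence graph.

    Illustrative only: edges are unweighted and based purely on co-occurrence.
    """
    # Link two distinct terms if they appear within `window` tokens of each other
    # (window=2 links only adjacent tokens).
    neighbors = defaultdict(set)
    for i, term in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != term:
                neighbors[term].add(other)
                neighbors[other].add(term)

    vocab = sorted(set(tokens))
    # Start from a uniform distribution over the vocabulary.
    scores = {t: 1.0 / len(vocab) for t in vocab}

    for _ in range(iterations):
        updated = {}
        for t in vocab:
            # Each neighbor passes on an equal share of its current score.
            incoming = sum(scores[n] / len(neighbors[n]) for n in neighbors[t])
            updated[t] = (1 - damping) / len(vocab) + damping * incoming
        scores = updated
    return scores


tokens = "text classification assigns text to categories text classification".split()
weights = random_walk_term_weights(tokens)
for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

In this toy input, "text" co-occurs with the most distinct terms, so the walk assigns it the largest weight. Such weights could then stand in for raw term frequency when building the document vectors used by a centroid-based classifier such as Rocchio.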
