Abstract

Automatic text classification, the process of automatically assigning texts to predefined categories, has many everyday applications and has recently gained much attention due to the increased number of text documents available in electronic form. Classifying news articles is one such application. Automatic classification is a machine learning task in which a classifier is built by learning from pre-classified documents. Naive Bayes and k-Nearest Neighbor are among the most common machine learning algorithms for text classification. In this paper, we suggest a way to improve the performance of a text classifier using the Mutual Information (MI) and Chi-square feature selection algorithms. We have observed that the MI feature selection method can improve the accuracy of the Naive Bayes classifier by up to 10%. Experimental results show that the proposed model achieves an average accuracy of 80% and an average F1-measure of 80%.
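
The abstract does not say how the MI score is computed. The following is a minimal sketch, assuming the common formulation in which MI is measured between the presence of a term and membership in a class over a 2x2 contingency table of document counts. The function names (`mutual_information`, `select_top_terms`), the rule of taking the maximum score over classes, and the toy documents are all illustrative assumptions, not taken from the paper.

```python
import math

def mutual_information(docs, labels, term, cls):
    """MI (in bits) between presence of `term` and membership in `cls`.

    `docs` is a list of token sets; `labels` is the parallel list of classes.
    """
    n = len(docs)
    # 2x2 contingency counts: first index = term present, second = in class.
    n11 = n10 = n01 = n00 = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_cls = label == cls
        if has_term and in_cls:
            n11 += 1
        elif has_term:
            n10 += 1
        elif in_cls:
            n01 += 1
        else:
            n00 += 1
    mi = 0.0
    # Each tuple: (joint count, term marginal, class marginal).
    for n_tc, n_t, n_c in (
        (n11, n11 + n10, n11 + n01),
        (n10, n11 + n10, n10 + n00),
        (n01, n01 + n00, n11 + n01),
        (n00, n01 + n00, n10 + n00),
    ):
        if n_tc > 0:  # empty cells contribute nothing to the sum
            mi += (n_tc / n) * math.log2(n * n_tc / (n_t * n_c))
    return mi

def select_top_terms(docs, labels, k):
    """Keep the k terms with the highest MI score (max over classes; an assumption)."""
    vocab = sorted({t for tokens in docs for t in tokens})
    classes = sorted(set(labels))
    scores = {
        t: max(mutual_information(docs, labels, t, c) for c in classes)
        for t in vocab
    }
    # Sort by descending score, breaking ties alphabetically.
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]

# Toy corpus (illustrative only, not data from the paper).
docs = [
    {"goal", "match", "team"},
    {"team", "score", "goal"},
    {"election", "vote", "party"},
    {"party", "vote", "policy"},
]
labels = ["sport", "sport", "politics", "politics"]

selected = select_top_terms(docs, labels, 4)
```

In this sketch, terms that occur in every document of one class and in none of the other receive the highest score, so they are the ones retained as features for a downstream classifier such as Naive Bayes.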
