Abstract

A text classification system’s learning depends substantially on the input features and on how they are extracted and selected. The primary motivation for feature selection is to reduce the dimensionality of the problem at hand, thereby facilitating classification. Among several problem areas, text categorization is one where feature selection plays a vital role: text categorization is well known to suffer from the curse of dimensionality, which produces a feature space that may contain redundant or irrelevant features and thus leads to a poor classifier. Therefore, feature selection is an important step in building an intelligent classifier. This paper has a fourfold objective. Firstly, it creates a word-to-vector space using a widely used weighting score. Secondly, it optimizes the text feature space using a nature-inspired algorithm. Thirdly, it compares the classification performance of three prominently used classifiers in text classification: SVM, Naive Bayes, and k-Nearest Neighbors. Lastly, it compares performance metrics beyond accuracy to understand the consequences of optimizing the feature space with a nature-inspired algorithm. The standard text classification dataset Reuters-21578 was used, and the classification accuracies reached 95.07%, 92.23%, and 87.37% for SVM, Naive Bayes, and k-Nearest Neighbors, respectively. Besides accuracy, precision, recall, and F-measure were used as performance metrics. Considering the encouraging results achieved with the ABC (Artificial Bee Colony) algorithm, the method seems promising for other text classification applications.
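The pipeline the abstract outlines can be illustrated with a short, hedged sketch. The paper's exact term-weighting score and its ABC-based feature selection are not reproduced here; TF-IDF weighting and a chi-squared filter merely stand in for them, and the names docs, labels, and k_features are hypothetical, assumed to hold the Reuters-21578 texts, their category labels, and the number of features to keep.

# Illustrative sketch only, not the authors' implementation: TF-IDF and a
# chi-squared filter stand in for the paper's weighting score and ABC-based
# feature selection; docs/labels/k_features are assumed inputs.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def evaluate(docs, labels, k_features=2000):
    # Hold out a test split for measuring accuracy, precision, recall, F1.
    X_train, X_test, y_train, y_test = train_test_split(
        docs, labels, test_size=0.3, random_state=42, stratify=labels)

    # Word-to-vector space via TF-IDF weighting.
    vectorizer = TfidfVectorizer(stop_words="english")
    X_train_vec = vectorizer.fit_transform(X_train)
    X_test_vec = vectorizer.transform(X_test)

    # Feature-space reduction (chi-squared here, ABC in the paper).
    selector = SelectKBest(chi2, k=min(k_features, X_train_vec.shape[1]))
    X_train_sel = selector.fit_transform(X_train_vec, y_train)
    X_test_sel = selector.transform(X_test_vec)

    # The three classifiers compared in the paper.
    classifiers = {
        "SVM": LinearSVC(),
        "Naive Bayes": MultinomialNB(),
        "k-NN": KNeighborsClassifier(n_neighbors=5),
    }
    for name, clf in classifiers.items():
        clf.fit(X_train_sel, y_train)
        pred = clf.predict(X_test_sel)
        acc = accuracy_score(y_test, pred)
        prec, rec, f1, _ = precision_recall_fscore_support(
            y_test, pred, average="macro", zero_division=0)
        print(f"{name}: accuracy={acc:.4f} precision={prec:.4f} "
              f"recall={rec:.4f} F1={f1:.4f}")

As a usage example, docs and labels could be built from single-category documents in a local copy of the Reuters-21578 (ApteMod) corpus and then passed to evaluate(docs, labels); the reported scores would of course depend on that preprocessing and will not necessarily match the figures quoted in the abstract.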
