Abstract
Owing to the large number of documents available on the internet, in emails and in digital libraries, document classification has become a crucial task. It is commonly performed after feature selection, which consists of selecting the features most likely to improve classification accuracy. Most feature-selection-based text classification methods rely on building a term frequency-inverse document frequency (TF-IDF) feature vector, which is not always efficient. In addition, most document classification studies focus on the English language. This paper deals with Arabic Text Classification, which has not been studied intensively because of the complexity of the Arabic language. A new feature selection method based on the firefly algorithm is proposed. This algorithm has been applied successfully to various combinatorial problems, but it has not previously been used for feature selection in Arabic Text Classification. To validate the technique, a Support Vector Machine classifier is used together with three evaluation measures: precision, recall and F-measure. Experiments on the real-world OSAC dataset are performed, along with a comparison with state-of-the-art methods. The proposed method achieves a precision of 0.994. The results confirm the efficiency of the proposed feature selection method in improving Arabic Text Classification accuracy.
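For readers unfamiliar with the three evaluation measures named above, the following minimal sketch shows how precision, recall and F-measure are conventionally computed for a single class. It is an illustration under stated assumptions, not the authors' evaluation code: the function name `precision_recall_f1`, the class labels and the toy predictions are all hypothetical, and the F-measure is assumed to be the balanced F1 score.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall and F1 for one class (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted as the class and actually the class.
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    # False positives: predicted as the class but actually another class.
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    # False negatives: actually the class but predicted as another class.
    fn = sum(1 for t, p in pairs if p != positive and t == positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels, invented for illustration only (not taken from OSAC).
y_true = ["sport", "sport", "health", "sport", "health"]
y_pred = ["sport", "health", "health", "sport", "sport"]

precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="sport")
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In a multi-class setting such as news-category classification, these per-class values are typically averaged across classes; which averaging scheme the paper uses is not stated in the abstract.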
More From: Journal of King Saud University - Computer and Information Sciences