Abstract
The vast majority of data in the digital era is stored as text, and text mining is an integral part of data mining. Text classification (TC) is a natural language processing (NLP) operation that text mining often requires. It is used in many research areas, such as information retrieval, document classification, language detection, and sentiment analysis. According to the literature, filter-based feature selection methods have often been applied to reduce the dimensionality of the data in Turkish TC. However, wrapper-based feature selection methods can provide higher classification accuracy than filter methods. Motivated by this idea, this study proposes a Turkish TC method based on wrapper feature selection using the particle swarm optimization (PSO) algorithm and a multinomial naive Bayes (MNB) classifier. The TTC-3600 dataset of Turkish news texts is used in the experiments. The proposed method achieves a classification accuracy of 94.55% on TTC-3600 using TF-IDF features extracted from stemmed text. Hence, its accuracy is competitive with that of state-of-the-art Turkish TC methods.
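The wrapper idea summarized above can be illustrated with a minimal sketch. The abstract does not specify the PSO variant, its parameters, or the fitness function, so everything below is an assumption: a binary PSO with a sigmoid transfer function, a small hand-written multinomial naive Bayes, validation accuracy as the fitness, and a tiny made-up term-count matrix rather than TTC-3600. All function names and constants are hypothetical and chosen for illustration only.

```python
import math
import random

def train_mnb(X, y, alpha=1.0):
    """Fit multinomial naive Bayes with Laplace smoothing (alpha)."""
    log_prior, log_lik = {}, {}
    n_feat = len(X[0])
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior[c] = math.log(len(rows) / len(X))
        counts = [sum(r[j] for r in rows) + alpha for j in range(n_feat)]
        total = sum(counts)
        log_lik[c] = [math.log(v / total) for v in counts]
    return log_prior, log_lik

def predict_mnb(model, x):
    """Return the class with the highest log posterior for one document."""
    log_prior, log_lik = model
    return max(log_prior, key=lambda c: log_prior[c]
               + sum(v * w for v, w in zip(x, log_lik[c])))

def fitness(mask, X_tr, y_tr, X_va, y_va):
    """Wrapper criterion: validation accuracy of MNB on the selected columns."""
    cols = [j for j, m in enumerate(mask) if m]
    if not cols:  # an empty feature subset cannot classify anything
        return 0.0
    def select(X):
        return [[x[j] for j in cols] for x in X]
    model = train_mnb(select(X_tr), y_tr)
    hits = sum(predict_mnb(model, x) == t for x, t in zip(select(X_va), y_va))
    return hits / len(y_va)

def binary_pso(X_tr, y_tr, X_va, y_va, n_particles=8, n_iter=15, seed=0):
    """Search over 0/1 feature masks; constants are illustrative assumptions."""
    rng = random.Random(seed)
    d = len(X_tr[0])
    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_particles)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, X_tr, y_tr, X_va, y_va) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive, and social weights
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(d):
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-6.0, min(6.0, v))  # clamp the velocity
                # The sigmoid maps velocity to the probability of keeping feature j.
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            f = fitness(pos[i], X_tr, y_tr, X_va, y_va)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

# Made-up term counts: 4 terms, 2 classes (not taken from TTC-3600).
X_train = [[3, 0, 1, 2], [2, 0, 2, 1], [4, 1, 1, 1],
           [0, 3, 2, 1], [1, 4, 1, 2], [0, 2, 1, 1]]
y_train = ["sports"] * 3 + ["economy"] * 3
X_val = [[3, 1, 1, 1], [0, 3, 1, 2], [2, 0, 2, 2], [1, 3, 2, 1]]
y_val = ["sports", "economy", "sports", "economy"]

best_mask, best_fit = binary_pso(X_train, y_train, X_val, y_val)
print("selected feature mask:", best_mask, "validation accuracy:", best_fit)
```

The point of the sketch is the contrast with filter methods: each candidate feature subset is scored by actually training and evaluating the classifier, which is costlier than ranking features by a statistic but ties the selection directly to classification accuracy.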
More From: Afyon Kocatepe University Journal of Sciences and Engineering