Abstract
This is the decade of Big Data, with an extensive increase in textual information, and text classification is a significant approach for processing and organizing that information. Text categorization refers to the process of automatically assigning documents to the relevant classes. The key feature of the text classification problem is the tremendous increase in the dimensionality of text information. Meta-heuristic approaches are readily employed to obtain optimal solutions for high-dimensional datasets in text categorization. However, some of these approaches, such as the genetic algorithm and particle swarm optimization, give sub-optimal solutions, take longer to converge than other approaches, and cannot guarantee a global optimum for text categorization. Thus, in this paper, a nature-inspired optimization approach based on the hunting mechanism of antlions, known as the Ant Lion Optimizer (ALO), is applied to resolve high-dimensionality issues prior to text classification. The precision and recall values of the proposed approach compare favorably with those of existing dimensionality reduction techniques for text categorization.
Highlights
The rapid increase in internet usage and the availability of online documents have made processing textual information one of the key issues nowadays
Conventional approaches to feature subset selection in text categorization employ an estimation function that is applied to every unique term
The experiments for the proposed Antlion Optimization based text categorization are carried out using two different data samples. i) Reuters-21578: This data set comprises 21578 articles obtained from the Reuters newswire and was downloaded from the web site https://archive.ics.uci.edu/ml/datasets/Reuters-21578+Text+Categorization+Collection
Summary
The rapid increase in internet usage and the availability of online documents have made processing textual information one of the key issues nowadays. A dominant issue in numerical text categorization [2] is the high dimensionality of the feature domain. Selecting a set of representative features from the original feature domain is required to reduce the size of the feature set and to improve the efficiency and accuracy of the classifiers. Conventional approaches to feature subset selection in text categorization employ an estimation function that is applied to every unique term. Genetic algorithms and particle swarm optimization are the most well-known meta-heuristic methods. The Ant Lion Optimizer (ALO) is a recent meta-heuristic that mathematically models the interaction of ants and antlions in nature
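The conventional per-term selection described above can be illustrated with a minimal sketch. It assumes document frequency as the estimation function and a fixed cut-off `k`; the summary names neither, and the helper names and the toy corpus are hypothetical. It is not the ALO-based method of the paper, only the baseline style of scoring every unique term and keeping the best ones.

```python
from collections import Counter


def document_frequency(docs):
    """Score each unique term by the number of documents that contain it.

    Document frequency is an assumed stand-in for the estimation function.
    """
    df = Counter()
    for tokens in docs:
        # Use a set so a term is counted at most once per document.
        df.update(set(tokens))
    return df


def select_top_terms(docs, k):
    """Keep the k highest-scoring terms as the reduced feature set."""
    df = document_frequency(docs)
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(df.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical toy corpus: each document is a list of tokens.
docs = [
    ["oil", "price", "rise"],
    ["oil", "export", "fall"],
    ["wheat", "price", "rise"],
]
selected = select_top_terms(docs, 3)
print(selected)
```

Because each term is scored independently, a filter of this kind cannot evaluate how well a whole subset of features works together, which is one motivation for searching over subsets with a meta-heuristic instead.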