Feature Selection using Stochastic Diffusion Search Algorithm in Big Data Analysis

doi:10.35940/ijrte.d1051.1284s519

Abstract

Big Data analysis is the processing or mining of massive amounts of data to retrieve useful information from large datasets. Among the methods used for Big Data analysis, feature selection has been found to be extremely effective. A common approach is to search for a subset of relevant features that still represents the dataset faithfully. However, searching over feature subsets is a time-consuming combinatorial problem, so meta-heuristic algorithms are commonly used to facilitate feature selection. Stochastic Diffusion Search (SDS) is a multi-agent global search algorithm that relies on simple interactions between agents to tackle combinatorial problems. In this work, SDS selects the feature subset for the classification task. The Classification and Regression Tree (CART), Naïve Bayes (NB), Support Vector Machine (SVM) and K-Nearest Neighbour (KNN) classifiers are then applied to the selected features. Results show that the proposed method achieves better performance than existing techniques.
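To make the idea of SDS-driven feature selection concrete, the following is a minimal sketch and not the authors' implementation. It assumes a synthetic two-class dataset, a nearest-centroid classifier scored on its own training rows as a cheap stand-in for the classifiers named above, and a comparative test phase in which an agent becomes active when its subset outscores that of a randomly chosen agent on a random sample of rows. All function names and parameter values (make_data, centroid_accuracy, random_subset, sds_select, n_agents, sample_size) are hypothetical.

```python
import random

def make_data(n=120, n_features=8, n_informative=2, seed=0):
    """Synthetic two-class data; only the first n_informative features carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(2.0 * label if j < n_informative else 0.0, 1.0)
                  for j in range(n_features)])
        y.append(label)
    return X, y

def centroid_accuracy(X, y, subset, rows):
    """Nearest-centroid accuracy on `rows`, using only the features in `subset`.
    This is a resubstitution score (assumption: a cheap proxy for a real classifier)."""
    cents = {}
    for c in (0, 1):
        members = [X[i] for i in rows if y[i] == c]
        if not members:
            return 0.0
        cents[c] = [sum(r[j] for r in members) / len(members) for j in subset]
    correct = 0
    for i in rows:
        dist = {c: sum((X[i][j] - m) ** 2 for j, m in zip(subset, cents[c]))
                for c in cents}
        correct += int(min(dist, key=dist.get) == y[i])
    return correct / len(rows)

def random_subset(n_features, rng):
    """Random non-empty feature subset, returned as a sorted tuple of indices."""
    subset = [j for j in range(n_features) if rng.random() < 0.5]
    return tuple(subset) if subset else (rng.randrange(n_features),)

def sds_select(X, y, n_agents=20, n_iter=30, sample_size=40, seed=1):
    rng = random.Random(seed)
    n_features = len(X[0])
    all_rows = list(range(len(X)))
    full = lambda h: centroid_accuracy(X, y, h, all_rows)
    hyps = [random_subset(n_features, rng) for _ in range(n_agents)]
    best = max(hyps, key=full)
    for _ in range(n_iter):
        # Test phase: partial evaluation on a random sample of rows; an agent is
        # active if its subset scores higher than a randomly chosen agent's subset.
        rows = rng.sample(all_rows, sample_size)
        scores = [centroid_accuracy(X, y, h, rows) for h in hyps]
        active = [scores[a] > scores[rng.randrange(n_agents)]
                  for a in range(n_agents)]
        # Diffusion phase: each inactive agent polls a random agent. If that agent
        # is active, copy its subset with one feature flipped; otherwise restart.
        new_hyps = list(hyps)
        for a in range(n_agents):
            if active[a]:
                continue
            other = rng.randrange(n_agents)
            if active[other]:
                flipped = set(hyps[other]) ^ {rng.randrange(n_features)}
                new_hyps[a] = tuple(sorted(flipped)) if flipped else hyps[other]
            else:
                new_hyps[a] = random_subset(n_features, rng)
        hyps = new_hyps
        candidate = max(hyps, key=full)
        if full(candidate) > full(best):
            best = candidate
    return best, full(best)

X, y = make_data()
subset, acc = sds_select(X, y)
print("selected features:", subset, "accuracy:", round(acc, 3))
```

In a realistic setting the nearest-centroid score would be replaced by the cross-validated accuracy of CART, NB, SVM or KNN on the candidate subset. The partial evaluation on a row sample is what keeps each test cheap relative to scoring every subset on the full dataset.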
