Development of a Java Library with Bacterial Foraging Optimization for Feature Selection of High-Dimensional Data

Tessy Badriyah,Iwan Syarif,Fitriani Rohmah Hardiyanti

doi:10.62527/joiv.8.1.2149

Abstract

High-dimensional data allows researchers to conduct comprehensive analyses. However, such data often exhibits characteristics like small sample sizes, class imbalance, and high complexity, posing challenges for classification. One approach employed to tackle high-dimensional data is feature selection. This study uses the Bacterial Foraging Optimization (BFO) algorithm for feature selection. A dedicated BFO Java library is developed to extend the capabilities of WEKA for feature selection purposes. Experimental results confirm the successful integration of BFO. The outcomes of BFO's feature selection are then compared against those of other evolutionary algorithms, namely Genetic Algorithm (GA), Particle Swarm Optimization (PSO), Artificial Bee Colony (ABC), and Ant Colony Optimization (ACO). Comparison of algorithms conducted using the same datasets. The experimental results indicate that BFO effectively reduces features while maintaining consistent accuracy. In 4 out of 9 datasets, BFO outperforms other algorithms, showcasing superior processing time performance in 6 datasets. BFO is a favorable choice for selecting features in high-dimensional datasets, providing consistent accuracy and effective processing. The optimal fraction of features in the Ovarian Cancer dataset signifies that the dataset retains a minimal number of selected attributes. Consequently, the learning process gains speed due to the reduced feature set. Remarkably, accuracy substantially increased, rising from 0.868 before feature selection to 0.886 after feature selection. The classification processing time has also been significantly shortened, completing the task in just 0.3 seconds, marking a remarkable improvement from the previous 56.8 seconds.

Full Text