Bio-inspired feature selection algorithms have attracted the attention of researchers in the domain of Software Development Effort Estimation (SDEE) because they can improve the prediction accuracy of existing estimation techniques, such as machine learning methods. This paper analyzes different feature selection algorithms and assesses the role they can play in increasing the accuracy of software development effort predictions. We performed an empirical study of bio-inspired feature selection algorithms commonly used in the domain of SDEE, i.e., Genetic Algorithm (GA), Particle Swarm Optimization, Ant Colony Optimization, Tabu Search, Harmony Search (HS), and the Firefly algorithm, and four traditional non-bio-inspired algorithms, i.e., Best-First Search (BFS), Greedy Stepwise, Subset Forward Selection, and Random Search. Each was used in combination with five widely used estimation techniques and applied to eight widely used SDEE datasets. The analysis suggests that almost all (bio-inspired) feature selection algorithms outperformed the baseline estimation techniques (i.e., techniques employed without any feature selection algorithm) in the majority of the experiments; hence, we conclude that feature selection algorithms can help increase prediction accuracy in the domain of SDEE. HS and GA are considered the best-performing bio-inspired algorithms because they provided significantly better results than the non-bio-inspired algorithms in a greater number of experiments. Moreover, when we compared the bio-inspired algorithms with one another, GA and HS again came out as the best-performing bio-inspired feature selection algorithms.
Based on our results, if we had to pick one feature selection algorithm (from both the bio-inspired and non-bio-inspired groups) to recommend for future investigations, we would suggest HS, because it provided better effort predictions in more combinations of datasets and estimation techniques than the other algorithms considered. Among the non-bio-inspired algorithms, BFS provided the best predictions.
• Feature selection algorithms should be used to build effort estimation models.
• Bio-inspired algorithms performed better than non-bio-inspired algorithms.
• Harmony Search and Genetic Algorithm performed better than the other algorithms.