Induction of decision trees as classification models through metaheuristics

Rafael Rivera-Lopez, Marco Antonio Cruz-Chávez, Juana Canul-Reich, Efrén Mezura-Montes

doi:10.1016/j.swevo.2021.101006

Abstract

• The three types of metaheuristics and the hyperheuristic-based methods used to build decision tree induction algorithms are included, not just evolutionary-algorithm-based approaches.
• Three types of implementation strategies are introduced, according to where a metaheuristic is used in the decision tree induction process.
• The differences in solution representation are highlighted, as they affect the metaheuristic-based implementation.
• The principal components of each method are detailed, such as fitness measures and variation operators.
• An analysis of the experimental studies reported for these methods is conducted.

The induction of decision trees is a widely used approach to build classification models that guarantee high performance and expressiveness. Since a recursive-partitioning strategy guided by some splitting criterion is commonly used to induce these classifiers, overfitting, attribute selection bias, and instability under small changes in the training set are well-known problems with them. Other approaches, such as incremental induction, classifier ensembles, and global search in the decision-tree space, have been implemented to overcome these problems. In particular, metaheuristics such as simulated annealing, genetic algorithms, genetic programming, and ant colony optimization have been used to induce compact and accurate decision trees. This paper presents a state-of-the-art review of the use of single-solution-based metaheuristics and swarm and evolutionary computation algorithms to build decision trees as classification models. We outline the components of the decision-tree-induction process and detail the existing literature on metaheuristic-based approaches to building these classifiers. Several timelines showing the chronological order in which these approaches were introduced in the literature are included. A summary analysis of these studies is also conducted, focusing on their internal components and experimental studies. This work provides a useful reference point for future research in this field.

Full Text