On the automatic design of decision-tree induction algorithms

Rodrigo Coelho Barros

doi:10.11606/t.55.2013.tde-21032014-144814

Abstract

Decision-tree induction is one of the most widely used methods for extracting knowledge from data. There are several distinct strategies for inducing decision trees from data, each with advantages and disadvantages that follow from its inductive bias. These strategies have been continuously improved by researchers over the last 40 years. This thesis, following recent breakthroughs in the automatic design of machine learning algorithms, proposes to automatically generate decision-tree induction algorithms. Our proposed approach, named HEAD-DT, is based on the evolutionary algorithms paradigm, which improves solutions based on metaphors of biological processes. HEAD-DT works over several manually-designed decision-tree components and combines the most suitable components for the task at hand. It can operate according to two different frameworks: i) evolving algorithms tailored to one single data set (specific framework); and ii) evolving algorithms from multiple data sets (general framework). The specific framework aims at generating one decision-tree algorithm per data set, so the resulting algorithm does not need to generalise beyond its target data set. The general framework has a more ambitious goal, which is to generate a single decision-tree algorithm capable of being effectively applied to several data sets. The specific framework is tested over 20 UCI data sets, and results show that HEAD-DT’s specific algorithms outperform algorithms like CART and C4.5 with statistical significance. The general framework, in turn, is executed under two different scenarios: i) designing a domain-specific algorithm; and ii) designing a robust domain-free algorithm. The first scenario is tested over 35 microarray gene expression data sets, and results show that HEAD-DT’s algorithms consistently outperform C4.5 and CART in different experimental configurations. The second scenario is tested over 67 UCI data sets, and HEAD-DT’s algorithms are shown to be competitive with C4.5 and CART. Nevertheless, we show that HEAD-DT is prone to a special case of overfitting when it is executed under the second scenario of the general framework, and we point to possible alternatives for solving this problem. Finally, we perform an extensive experiment to identify the best single-objective fitness function for HEAD-DT, combining five classification performance measures with three aggregation schemes. We evaluate the resulting 15 fitness functions on 67 UCI data sets, and the best of them are employed to generate algorithms tailored to balanced and imbalanced data. Results show that the automatically-designed algorithms outperform CART and C4.5 with statistical significance, indicating that HEAD-DT is also capable of generating custom algorithms for data with a particular kind of statistical profile.
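The count of 15 fitness functions follows from pairing each of the five performance measures with each of the three aggregation schemes. The sketch below illustrates only that pairing. It is not HEAD-DT's implementation: the abstract does not name the measures or the schemes, so the measure names, the three aggregators, and the `make_fitness` helper are placeholder assumptions.

```python
from itertools import product
from statistics import harmonic_mean, mean, median

# Placeholder names: the abstract does not say which five measures are used.
MEASURES = ["accuracy", "f_measure", "auc", "recall", "precision"]

# Placeholder schemes: the abstract does not say which three aggregations are used.
AGGREGATORS = {"mean": mean, "median": median, "harmonic_mean": harmonic_mean}


def make_fitness(measure, aggregate):
    """Build a fitness function that scores one candidate algorithm.

    The returned function takes one dict per data set, mapping a measure
    name to a score in [0, 1], and aggregates the chosen measure over
    all data sets into a single number.
    """
    def fitness(per_dataset_scores):
        return aggregate([scores[measure] for scores in per_dataset_scores])
    return fitness


# Every (measure, scheme) pair yields one candidate fitness function: 5 x 3 = 15.
FITNESS_FUNCTIONS = {
    (measure, scheme): make_fitness(measure, aggregate)
    for measure, (scheme, aggregate) in product(MEASURES, AGGREGATORS.items())
}
```

For instance, `FITNESS_FUNCTIONS[("accuracy", "mean")]` applied to the scores of one candidate algorithm on two data sets returns the average of its two accuracy values. In the general framework a single algorithm is judged on several data sets at once, which is why an aggregation step is needed at all.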

Publication Date: Jan 1, 2014
Citations: 4	License type: cc-by-nc-sa

Similar Papers

Evolutionary Design of Decision-Tree Algorithms Tailored to Microarray Gene Expression Data Sets
Rodrigo C Barros ... Marcio P Basgalupp
IEEE Transactions on Evolutionary Computation | VOL. 18
01 Dec 2014

HEAD-DT: Experimental Analysis
Rodrigo C Barros ... Alex A Freitas
01 Jan 2015

HEAD-DT: Automatic Design of Decision-Tree Algorithms
Rodrigo C Barros ... Alex A Freitas
01 Jan 2015

Automatic Design of Decision-Tree Algorithms with Evolutionary Algorithms
Rodrigo C Barros ... Alex A Freitas
Evolutionary Computation | VOL. 21
08 Aug 2013
