Abstract

Learning decision trees is a difficult optimization problem: it is nonconvex, nondifferentiable, and ranges over a huge number of tree structures. The dominant paradigm in practice, established in the 1980s, is axis-aligned trees trained with a greedy recursive partitioning algorithm such as CART or C5.0. Several non-greedy optimization algorithms, which optimize the parameters of all the nodes jointly, have been proposed recently; we compare some of them experimentally on a range of classification and regression datasets in terms of accuracy, training time, and tree size. The non-greedy algorithms do not improve significantly over CART, with one exception: tree alternating optimization (TAO). TAO scales to large datasets and produces axis-aligned and especially oblique trees that consistently outperform all other algorithms, often by a large margin. TAO makes oblique trees preferable to axis-aligned ones in many cases, since they are much more accurate while remaining small and interpretable. This suggests a change of paradigm in practical applications of decision trees.
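For context, a minimal sketch of the greedy axis-aligned baseline the abstract refers to, using scikit-learn's CART implementation. The dataset and hyperparameters here are illustrative assumptions, not those of the paper's experiments, and this is not the authors' code.

```python
# Sketch only: CART baseline (greedy recursive partitioning, axis-aligned splits).
# Dataset and max_depth are arbitrary choices for illustration.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each split thresholds a single feature (axis-aligned), chosen greedily to
# optimize an impurity criterion at that node; nodes are never revisited.
# Non-greedy methods such as TAO instead revisit and re-optimize the nodes
# of a fixed tree structure jointly, and can also learn oblique (linear
# combination) splits.
cart = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_train, y_train)
print("CART test accuracy:", accuracy_score(y_test, cart.predict(X_test)))
print("tree size (nodes):", cart.tree_.node_count)
```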
