Abstract

We introduce a conceptually simple yet effective method to create small, compact decision trees - by using splits found via Symbolic Regression (SR). Traditional decision tree (DT) algorithms partition a dataset on axis-parallel splits. When the true boundaries are not along the feature axes, DT is likely to have a complicated structure and a dense decision boundary. In this paper, we introduce SR-Enhanced DT (SREDT) - a method which utilizes SR to increase the richness of the class of possible DT splits. We evaluate SREDT on both synthetic and real-world datasets. Despite its simplicity, our method produces surprisingly small trees that outperform both DT and oblique DT (ODT) on supervised classification tasks in terms of accuracy and F-score. We show empirically that SREDTs decrease inference time (compared to DT and ODT) and argue that they allow us to obtain more explainable descriptions of the decision process. SREDT also performs competitively against state-of-the-art tabular classification methods, including tree ensembles and deep models. Finally, we introduce a local search mechanism to improve SREDT and evaluate it on 56 PMLB datasets. This mechanism shows improved performance on 77.2% of the datasets, outperforming DT and ODT. In terms of F-Score, local SREDT outperforms DT and ODT in 82.5% and 73.7% of the datasets respectively and in terms of inference time, local SREDT requires 25.8% and 26.6% less inference time than DT and ODT respectively.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call