Adaptive in-memory representation of decision trees for GPU-accelerated evolutionary induction

Krzysztof Jurczuk, Marcin Czajkowski, Marek Kretowski

doi:10.1016/j.future.2023.12.003

Abstract

Decision trees (DTs) are a machine learning technique used for classification and regression problems. They are considered part of explainable artificial intelligence because of their transparent and interpretable nature. The traditional approach to building DTs is a top-down greedy search, which is usually fast but may yield a sub-optimal solution. An alternative is evolutionary induction, which explores the search space more globally and can lead to simpler DTs, but is computationally more expensive. To speed up the induction, different parallel approaches have been considered, with the GPU-based ones being the most efficient.

In this paper, we extend the GPU-supported algorithm for evolutionary induction of DTs to use less memory and time. An improved in-memory representation of DTs is introduced, which adapts to the tree structure. Two representations are possible: compact and complete. The compact representation is used for sparsely filled DTs. It contains only the existing tree nodes, so the links between parent and child nodes must be stored explicitly (along with the data). The complete representation, in contrast, is applied when the tree levels are densely filled. Placeholders are then reserved for all potential nodes, and the parent position can be easily calculated from the child’s place (and vice versa). The algorithm switches between the representations based on a density factor of tree filling.

To validate our concept, we conducted experiments on both real-life and artificial datasets of varying sizes and dimensions. The results show that the DT in-memory representation strongly affects both time and memory requirements. The adaptive representation is the most efficient: it not only shortens the induction time but also reduces memory requirements (in terms of both transfer and occupancy). As a result, the evolutionary induction of DTs moves closer to being competitive with state-of-the-art greedy inducers in terms of computation time.
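
The two representations described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' GPU implementation: it assumes binary trees, a heap-style index layout for the complete representation, a density defined as occupied slots divided by the slots of a full tree of the same depth, and a switching threshold of 0.5. All names (`parent_index`, `CompactNode`, `density`, `choose_representation`) and the threshold value are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


# Complete representation (assumed heap-style layout): slot i holds the node
# payload or None for an empty placeholder. Links are computed, not stored.
def parent_index(i: int) -> int:
    """Slot of the parent of slot i, or -1 for the root."""
    return -1 if i == 0 else (i - 1) // 2


def left_child(i: int) -> int:
    return 2 * i + 1


def right_child(i: int) -> int:
    return 2 * i + 2


# Compact representation: only existing nodes, with explicit links.
@dataclass
class CompactNode:
    payload: str
    parent: int  # index into the compact list, -1 if none
    left: int    # index into the compact list, -1 if none
    right: int   # index into the compact list, -1 if none


def complete_to_compact(slots: List[Optional[str]]) -> List[CompactNode]:
    """Drop empty placeholders and store parent/child links explicitly."""
    # Map each occupied slot to its position in the compact list.
    index_of = {}
    for i, payload in enumerate(slots):
        if payload is not None:
            index_of[i] = len(index_of)
    nodes = []
    for i, payload in enumerate(slots):
        if payload is None:
            continue
        nodes.append(CompactNode(
            payload=payload,
            parent=index_of.get(parent_index(i), -1),
            left=index_of.get(left_child(i), -1),
            right=index_of.get(right_child(i), -1),
        ))
    return nodes


def density(slots: List[Optional[str]]) -> float:
    """Occupied slots divided by the slots of a full tree of equal depth."""
    occupied = [i for i, payload in enumerate(slots) if payload is not None]
    if not occupied:
        return 0.0
    levels = (max(occupied) + 1).bit_length()  # number of tree levels in use
    return len(occupied) / (2 ** levels - 1)


def choose_representation(slots: List[Optional[str]],
                          threshold: float = 0.5) -> str:
    """Pick a layout; the threshold value is an assumption of this sketch."""
    return "complete" if density(slots) >= threshold else "compact"


# A tree with a root, two children A and B, and C as the left child of B.
dense_tree = ["root", "A", "B", None, None, "C", None]
# A right-leaning chain: root -> X -> Y, leaving most placeholders empty.
sparse_tree = ["root", None, "X", None, None, None, "Y"]
```

In this sketch the complete layout wastes memory on empty placeholders but needs no stored links, whereas the compact layout stores three extra indices per node but nothing for absent nodes. That trade-off is the reason a density-based switch between the two can pay off.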

Full Text