Abstract

Decision trees (DTs) are popular techniques in the field of explainable machine learning. Traditionally, DTs are induced using a top-down greedy search that is usually fast; however, it may lead to sub-optimal solutions. Here, we deal with an alternative approach: evolutionary induction. It provides a global exploration that results in less complex DTs, but it is much more time-demanding. Various parallel computing approaches have been considered, among which the GPU-based one appears to be the most efficient. To speed up the induction further, different GPU memory organizations/layouts can be applied. In this paper, we introduce a compact in-memory representation of DTs. It is a one-dimensional array representation in which links between parent and child tree nodes are stored explicitly next to the node data (tests in internal nodes, classes in leaves, etc.). In contrast, when the complete representation is applied, children's positions are calculated from the parent's position. However, this requires a large one-dimensional array sized as if all DT levels were completely filled, regardless of whether all nodes actually exist. Experimental validation is performed on real-life and artificial datasets of various sizes and dimensions. Results show that the compact representation not only reduces memory requirements but also decreases induction time.
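To make the contrast between the two layouts concrete, the following is a minimal sketch (not the authors' implementation; the node fields, example tree, and helper names are illustrative assumptions). The complete layout computes child positions from the parent index, heap-style, so the array must be sized for a fully filled tree; the compact layout stores only existing nodes with explicit child links.

```python
def complete_children(parent):
    # Complete (heap-style) layout: child positions are computed from
    # the parent's index, so no links are stored, but the backing array
    # must be sized 2**depth - 1 as if every level were fully populated.
    return 2 * parent + 1, 2 * parent + 2

def complete_size(depth):
    # Memory grows exponentially with tree depth, even for sparse trees.
    return 2 ** depth - 1

# Compact layout: only the nodes that actually exist are stored, and
# each node carries explicit indices of its children (-1 = no child).
# Hypothetical example tree: a root test, one internal child, two leaves.
compact_tree = [
    {"data": "test A",  "left": 1,  "right": 2},   # node 0 (root, internal)
    {"data": "test B",  "left": 3,  "right": -1},  # node 1 (internal)
    {"data": "class 0", "left": -1, "right": -1},  # node 2 (leaf)
    {"data": "class 1", "left": -1, "right": -1},  # node 3 (leaf)
]

def compact_size(tree):
    # Memory grows only with the number of actual nodes.
    return len(tree)
```

For the 4-node tree above, which reaches depth 3 on one branch, the compact array holds 4 entries, whereas the complete layout would reserve `complete_size(3) == 7` slots regardless of which nodes exist.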
