Abstract

In data mining and machine learning applications, the Decision Tree classifier is widely used as a supervised learning method, both as a stand-alone model and as part of ensemble learning techniques such as Random Forest. The induction of Decision Trees (i.e. the training stage) involves intensive memory communication and inherent parallelism, making an FPGA a promising platform for accelerating the training process thanks to the high memory bandwidth offered by the device's embedded memory blocks. However, peak memory bandwidth is reached only when all channels of the FPGA's block RAMs are free for concurrent access, whereas accommodating large datasets often requires combining several block RAMs, rendering a number of memory channels unavailable. Efficient use of the embedded memory is therefore critical, not only for allowing larger training datasets to be processed on the FPGA but also for keeping as many memory channels as possible available to the rest of the system. In this work, a data compression scheme is proposed for the training data stored in the embedded memory, targeting specifically the axis-parallel Decision Tree classifier, with the aim of improving the memory utilisation of the device. The proposed scheme exploits the nature of the Decision Tree induction problem and improves the memory efficiency of the system without any compromise in classifier performance. It is demonstrated that the scheme reduces memory usage by up to 66% for the training datasets under investigation without compromising training accuracy, while a 28% reduction in training time is achieved thanks to the extra processing power enabled by the additional memory bandwidth.
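As a minimal sketch of how a lossless, accuracy-preserving compression of training data can work for axis-parallel trees (the paper's actual scheme is not detailed in the abstract, so this is only an illustrative assumption): an axis-parallel split compares a single feature against a threshold, so any order-preserving re-encoding of a feature column, such as replacing raw values with their rank among the distinct values, produces identical splits while needing only `ceil(log2(#distinct))` bits per stored value. The function name `rank_encode` and the sample data are hypothetical.

```python
import numpy as np

def rank_encode(column):
    """Replace raw feature values with their rank among the distinct values.

    An axis-parallel decision tree only ever compares a feature against a
    threshold, so any order-preserving re-encoding yields identical splits.
    The rank codes need only ceil(log2(#distinct)) bits per value instead
    of the full-width raw representation.
    """
    distinct = np.unique(column)               # sorted distinct values
    codes = np.searchsorted(distinct, column)  # order-preserving integer codes
    bits = max(1, int(np.ceil(np.log2(len(distinct)))))
    return codes, bits

# Hypothetical column: 32-bit floats with only 6 distinct values.
col = np.array([0.1, 7.5, 0.1, 3.2, 9.9, 3.2, 7.5, 0.1, 5.0, 1.4],
               dtype=np.float32)
codes, bits = rank_encode(col)
# 6 distinct values fit in 3 bits instead of 32, and every threshold
# comparison on the codes gives the same outcome as on the raw values.
```

Because the encoding is monotone, the tree induced on the coded column is structurally identical to the one induced on the raw column; only the split thresholds need translating back for deployment.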
