Abstract

Decision trees (DTs) play an important role in pattern recognition and machine learning, and are widely used for regression tasks because of their natural interpretability. However, the traditional decision tree is constructed by recursive Boolean splitting. This discrete decision-making process makes the tree non-differentiable and leads to hard decision boundaries. To address this problem, a probability distribution model, called Staired-Sigmoid, is proposed in this paper. The Staired-Sigmoid model makes the decision-making process differentiable, so that samples can be assigned to the two sub-trees in a finer-grained way. Based on Staired-Sigmoid, we further propose the soft decision tree (SDT) for regression tasks, in which samples are routed to sub-nodes according to a continuous probability distribution. This routing is differentiable, so all parameters of the SDT can be optimized by gradient descent. Owing to its construction rules, the SDT is more stable than a standard decision tree and less prone to overfitting. We validate the SDT on several datasets from the UCI repository. Experiments demonstrate that the SDT achieves better performance than the decision tree and significantly alleviates overfitting.
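As a rough illustration of the soft-routing idea described above, the sketch below implements a single differentiable split that sends each sample to the left and right children with continuous probabilities, and trains it by gradient descent. It uses a plain sigmoid gate in place of the paper's Staired-Sigmoid, whose exact form is not given in this abstract; the class and parameter names (SoftStump, w, leaf) are hypothetical, not from the paper.

```python
import torch
import torch.nn as nn

class SoftStump(nn.Module):
    """A depth-1 soft decision tree (one split, two leaves) for regression.

    Each sample x goes left with probability p(x) = sigmoid(w . x + b) and
    right with probability 1 - p(x); the prediction is the probability-
    weighted mix of the two leaf values, so the whole model is
    differentiable end to end.

    NOTE: the sigmoid gate here is a stand-in for the paper's
    Staired-Sigmoid, whose definition the abstract does not give.
    """

    def __init__(self, n_features: int):
        super().__init__()
        self.w = nn.Linear(n_features, 1)          # split direction and threshold
        self.leaf = nn.Parameter(torch.zeros(2))   # left / right leaf outputs

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        p_left = torch.sigmoid(self.w(x)).squeeze(-1)   # soft routing probability
        return p_left * self.leaf[0] + (1.0 - p_left) * self.leaf[1]

# Toy usage: fit the stump to a 1-D step function.
torch.manual_seed(0)
x = torch.linspace(-2, 2, 200).unsqueeze(-1)
y = torch.where(x.squeeze(-1) > 0.5, torch.tensor(1.0), torch.tensor(-1.0))

model = SoftStump(n_features=1)
opt = torch.optim.Adam(model.parameters(), lr=0.1)
for _ in range(300):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()
```

A full soft decision tree would compose such splits recursively, weighting each leaf's output by the product of routing probabilities along the path from the root to that leaf.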
