Abstract

Oblique decision trees recursively partition the feature space using splits based on linear combinations of attributes. Compared to their univariate counterparts, which use only a single attribute per split, they are often smaller and more accurate. A common approach to learning decision trees is to introduce splits on a training set iteratively, in a top-down manner, yet determining a single optimal oblique split is in general computationally intractable, so one has to rely on heuristics to find near-optimal splits. In this paper, we adapt the cross-entropy optimization method to this problem. The approach is motivated geometrically by the observation that equivalent oblique splits can be interpreted as connected regions on a unit hypersphere, defined by the samples in the training data. In each iteration, the algorithm draws multiple candidate solutions from this hypersphere using the von Mises–Fisher distribution, which is parameterized by a mean direction and a concentration parameter. These parameters are then updated based on the best-performing samples, so that when the algorithm terminates, high probability mass is assigned to a region of near-optimal solutions. Our experimental results show that the proposed method induces compact and accurate oblique decision trees in a small amount of time.
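The following is a minimal sketch of the cross-entropy loop for finding a single oblique split, under stated assumptions that go beyond the abstract: candidates are scored by weighted Gini impurity, splits are encoded on the unit hypersphere via a homogeneous coordinate, and the concentration parameter is updated with a standard von Mises–Fisher maximum-likelihood approximation. The paper's exact scoring function and update rules may differ. The sampler assumes SciPy >= 1.11, which provides scipy.stats.vonmises_fisher.

```python
# Illustrative sketch, not the authors' exact procedure: the Gini-based
# score and the kappa update rule below are assumptions for demonstration.
import numpy as np
from scipy.stats import vonmises_fisher  # requires SciPy >= 1.11


def gini(y):
    """Gini impurity of a label vector."""
    if len(y) == 0:
        return 0.0
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


def split_score(w, X_aug, y):
    """Weighted Gini impurity of the split X_aug @ w <= 0; lower is better."""
    left = X_aug @ w <= 0.0
    n = len(y)
    return (left.sum() * gini(y[left]) + (~left).sum() * gini(y[~left])) / n


def ce_oblique_split(X, y, n_samples=100, elite_frac=0.1, n_iters=50,
                     kappa0=5.0, seed=0):
    """Cross-entropy search for an oblique split on the unit hypersphere.

    The hyperplane w[:d] . x + w[d] <= 0 is encoded with a homogeneous
    coordinate, so candidate splits are unit vectors in R^{d+1}.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    X_aug = np.hstack([X, np.ones((n, 1))])     # homogeneous coordinates
    dim = d + 1
    mu = np.zeros(dim)
    mu[0] = 1.0                                 # initial mean direction
    kappa = kappa0
    n_elite = max(1, int(elite_frac * n_samples))
    for _ in range(n_iters):
        # Sample candidate split directions from the current vMF distribution.
        W = vonmises_fisher(mu, kappa).rvs(n_samples, random_state=rng)
        scores = np.array([split_score(w, X_aug, y) for w in W])
        elite = W[np.argsort(scores)[:n_elite]]  # best-performing samples
        mean = elite.mean(axis=0)
        r = np.linalg.norm(mean)
        mu = mean / r                            # updated mean direction
        # Common vMF concentration approximation (Banerjee et al.);
        # capped to avoid numerical blow-up as the elites converge.
        kappa = min(r * (dim - r ** 2) / max(1.0 - r ** 2, 1e-12), 1e6)
    return mu


# Example: recover an oblique boundary x1 + x2 = 0 from synthetic data.
X = np.random.default_rng(1).normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
w = ce_oblique_split(X, y)
```

In a full induction procedure, a search of this kind would be invoked at each internal node in a top-down fashion, with the returned direction defining that node's split.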
