An improved decision tree algorithm based on variable precision neighborhood similarity

Caihui Liu, Bowen Lin, Jianying Lai, Duoqian Miao

doi:10.1016/j.ins.2022.10.043

Abstract

The decision tree algorithm is widely used in data mining and machine learning because of its high accuracy, low computational cost and high interpretability. However, when dealing with continuous data, the classical decision tree algorithm must first discretize continuous attributes. Discretization may destroy part of the information structure of the data, which degrades classification performance. To tackle this problem, many researchers have proposed decision tree methods based on variable precision neighborhood rough sets. However, these methods do not consider the geometric structure of neighborhood systems, which may lead to a contradiction in the transitivity of the equivalence relation. In this paper, we first define a novel neighborhood geometric similarity in a neighborhood system from the perspective of geometry. Second, by combining the neighborhood geometric similarity with the neighborhood algebraic similarity, we propose four new kinds of neighborhood similarities, which resolve the contradictory transitivity of the equivalence relation. Third, we construct a variable precision neighborhood rough set model using the new similarities and propose a novel decision tree algorithm based on this model, in which the degree of attribute dependence serves as the partition measure. Experimental results on 14 datasets selected from the UCI Machine Learning Repository show that our algorithm is effective. The average accuracy of our algorithm is over 90%, which is 10% higher than that of the classical decision tree algorithms, while the number of leaf nodes increases only slightly.
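To make the partition measure concrete, the following is a minimal sketch of a dependency degree in a generic variable precision neighborhood rough set. It is not the model proposed in this paper: it assumes plain Euclidean-distance neighborhoods instead of the four new neighborhood similarities, and the function names, the radius `delta` and the inclusion threshold `beta` are illustrative assumptions. A sample counts toward the positive region when at least a fraction `beta` of its neighborhood belongs to a single decision class, and the dependency degree is the share of such samples.

```python
import math


def vp_neighborhood_dependency(X, y, attrs, delta=0.2, beta=0.8):
    """Dependency of the decision y on the attribute subset attrs.

    Assumption: neighborhoods are Euclidean balls of radius delta on the
    chosen attributes (not the paper's similarity-based neighborhoods).
    """
    n = len(X)
    positive = 0
    for i in range(n):
        xi = [X[i][a] for a in attrs]
        # Neighborhood of sample i; it always contains i itself.
        neigh = [j for j in range(n)
                 if math.dist(xi, [X[j][a] for a in attrs]) <= delta]
        # Count how many neighbors fall into each decision class.
        counts = {}
        for j in neigh:
            counts[y[j]] = counts.get(y[j], 0) + 1
        # Variable precision inclusion: one class covers >= beta of the neighborhood.
        if max(counts.values()) / len(neigh) >= beta:
            positive += 1
    return positive / n


def best_split_attribute(X, y, delta=0.2, beta=0.8):
    """Pick the single attribute with the highest dependency degree."""
    scores = [vp_neighborhood_dependency(X, y, [a], delta, beta)
              for a in range(len(X[0]))]
    return scores.index(max(scores)), scores


# Toy data: attribute 0 separates the classes, attribute 1 is constant.
X = [[0.0, 0.5], [0.1, 0.5], [0.9, 0.5], [1.0, 0.5]]
y = ["a", "a", "b", "b"]
best, scores = best_split_attribute(X, y)
print(best, scores)
```

On this toy data the informative attribute receives the higher dependency degree and would be chosen for the split, without any discretization of the continuous values.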

Full Text