Abstract

The original random forests (RFs) algorithm has been widely used and has achieved excellent performance for the classification and regression tasks. However, the research on the theory of RFs lags far behind its applications. In this article, to narrow the gap between the applications and the theory of RFs, we propose a new RFs algorithm, called random Shapley forests (RSFs), based on the Shapley value. The Shapley value is one of the well-known solutions in the cooperative game, which can fairly assess the power of each player in a game. In the construction of RSFs, RSFs use the Shapley value to evaluate the importance of each feature at each tree node by computing the dependency among the possible feature coalitions. In particular, inspired by the existing consistency theory, we have proved the consistency of the proposed RFs algorithm. Moreover, to verify the effectiveness of the proposed algorithm, experiments on eight UCI benchmark datasets and four real-world datasets have been conducted. The results show that RSFs perform better than or at least comparable with the existing consistent RFs, the original RFs, and a classic classifier, support vector machines.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call