Abstract

This paper proposes a new splitting criterion for classification trees that generates decision rules with better interpretability. The criterion is designed to find homogeneous rules that cover a significant number of instances while remaining short. The proposed criterion considers only one side of a split to generate highly homogeneous rules and concurrently uses a function of sample ratios with an adjustable hyperparameter to control the coverage of the rules. The distinctive feature of the proposed method is that it is applied adaptively at every split. We also introduce an efficient heuristic algorithm to determine an appropriate hyperparameter value for every split. Experimental results on 17 benchmark datasets show that the proposed criterion, combined with the proposed heuristic, constructs a more interpretable decision tree. Quantitative and qualitative analyses verify that the constructed tree produces highly interpretable rules and that its predictive performance is comparable to that of other popular criteria.
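
The abstract describes the criterion only at a high level and does not state its scoring function here. As a purely illustrative sketch (not the paper's formula), the Python snippet below shows one way a one-sided splitting score with a coverage-controlling hyperparameter could look; the names one_sided_score and alpha are chosen for illustration only.

    import numpy as np

    def one_sided_score(y_left, y_right, alpha=0.5):
        """Illustrative one-sided splitting score (not the paper's exact criterion).
        Each side of a candidate split is scored by its majority-class purity,
        weighted by a power of its sample ratio; alpha controls how strongly rule
        coverage is rewarded, and the better side determines the split's score."""
        n_left, n_right = len(y_left), len(y_right)
        n = n_left + n_right

        def purity(y):
            _, counts = np.unique(y, return_counts=True)
            return counts.max() / len(y)  # fraction of the majority class

        score_left = purity(y_left) * (n_left / n) ** alpha
        score_right = purity(y_right) * (n_right / n) ** alpha
        return max(score_left, score_right)

Under this sketch, a larger alpha penalizes small, narrow rules more heavily, while alpha near zero rewards purity almost regardless of how few samples the rule covers.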

Highlights

  • A decision tree is one of the most popular machine learning algorithms for supervised learning problems

  • We present extensive experimental results showing the competitiveness of the proposed splitting criterion compared with other popular criteria: entropy, the gain ratio (GR), the Gini index, Tsallis entropy [11], [12], and the Tsallis gain ratio (Tsallis GR) [11], [12]

  • In this paper, we address the interpretability of decision trees, which can be evaluated in terms of homogeneity, sample coverage, and the length of the generated rules
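
The last highlight names three measures of rule interpretability but does not define them here. The sketch below shows one plausible operationalization of homogeneity, coverage, and length for a single root-to-leaf rule; these definitions are assumptions made for illustration, not the paper's exact formulas.

    import numpy as np

    def rule_metrics(rule_conditions, covered_labels, n_total):
        """Assumed (not paper-specified) definitions of the three measures:
          homogeneity - fraction of covered samples in the majority class
          coverage    - fraction of all samples reaching the rule's leaf
          length      - number of attribute conditions on the root-to-leaf path"""
        _, counts = np.unique(covered_labels, return_counts=True)
        homogeneity = counts.max() / len(covered_labels)
        coverage = len(covered_labels) / n_total
        length = len(rule_conditions)
        return homogeneity, coverage, length

For example, a hypothetical rule with the single condition ('petal_width', '<=', 0.8) that covers 50 of 150 samples, all of one class, would score homogeneity 1.0, coverage of about 0.33, and length 1.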


Introduction

A decision tree is one of the most popular machine learning algorithms for supervised learning problems. It has been widely used and is still being actively researched, since it shows relatively good predictive performance and provides decision rules that are easy to understand and interpret [1]–[5]. The split point with the highest score under the splitting criterion is selected at each node, and this procedure is applied recursively until certain stopping conditions, such as a maximum tree depth or minimum leaf size, are met. Through this process, decision rules, consisting of a subset of attributes and the corresponding split values, can be obtained. It is well known that decision trees are superior to other supervised learning algorithms in terms of model interpretability [6], [7].
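
As a minimal sketch of the standard greedy procedure described above (not the paper's specific implementation), the function below grows a tree by scoring every candidate split with a pluggable criterion, such as the one_sided_score sketch earlier, and recursing until a maximum depth or minimum leaf size is reached. All names are illustrative.

    import numpy as np

    def grow_tree(X, y, criterion, depth=0, max_depth=5, min_leaf=10):
        """Greedy top-down induction: choose the split with the highest criterion
        score and recurse. y holds non-negative integer class labels."""
        # Stopping conditions: depth limit, node too small, or node already pure.
        if depth >= max_depth or len(y) < 2 * min_leaf or len(np.unique(y)) == 1:
            return {"leaf": True, "prediction": int(np.bincount(y).argmax())}

        best = None
        for j in range(X.shape[1]):                      # every attribute
            for t in np.unique(X[:, j])[:-1]:            # every candidate threshold
                mask = X[:, j] <= t
                if mask.sum() < min_leaf or (~mask).sum() < min_leaf:
                    continue                             # split violates leaf-size limit
                score = criterion(y[mask], y[~mask])
                if best is None or score > best[0]:
                    best = (score, j, t, mask)

        if best is None:                                 # no admissible split found
            return {"leaf": True, "prediction": int(np.bincount(y).argmax())}

        _, j, t, mask = best
        return {"leaf": False, "attribute": j, "threshold": t,
                "left": grow_tree(X[mask], y[mask], criterion, depth + 1, max_depth, min_leaf),
                "right": grow_tree(X[~mask], y[~mask], criterion, depth + 1, max_depth, min_leaf)}

The root-to-leaf paths of the resulting tree are exactly the decision rules mentioned above: each path is a conjunction of (attribute, threshold) conditions ending in a class prediction.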
