Abstract

Uncertainty evaluation based on statistical probabilistic information entropy is a commonly used heuristic for constructing decision trees. The entropy kernel potentially links its deviation to decision tree classification performance. This paper presents a decision tree learning algorithm based on constrained gain and depth induction optimization. First, we calculate and analyze the uncertainty distributions of information entropy for single- and multi-value events, which reveals an enhanced property of the single-value event entropy kernel and of the multi-value event entropy peaks, as well as a reciprocal relationship between peak location and the number of possible events. Second, we propose an estimation method for information entropy in which the entropy kernel is replaced with a peak-shift sine function, and use it to establish a constraint gain decision tree learning (CGDT) algorithm. Finally, by combining branch convergence and fan-out indices under the inductive depth of a decision tree, we build a constraint gained and depth inductive improved decision tree (CGDIDT) learning algorithm. Results show the benefits of the CGDT and CGDIDT algorithms.
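
For orientation, the conventional quantities the abstract builds on can be sketched as follows. This is a minimal illustration of the standard Shannon entropy kernel -p log2(p) and of ID3-style information gain, not the CGDT estimate itself: the peak-shift sine replacement is not specified in this section and is deliberately not reproduced. The function names and the toy data are assumptions made for the example.

```python
import math
from collections import Counter


def entropy_kernel(p):
    """Single-event term -p * log2(p); the limit at p = 0 is taken as 0."""
    return 0.0 if p <= 0.0 else -p * math.log2(p)


def shannon_entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return sum(entropy_kernel(count / n) for count in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the samples on rows[i][attr]."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    # Weighted entropy of the subsets produced by the split.
    remainder = sum(len(g) / n * shannon_entropy(g) for g in groups.values())
    return shannon_entropy(labels) - remainder


# Hypothetical toy dataset: one attribute that separates the classes perfectly.
rows = [{"outlook": "sunny"}, {"outlook": "sunny"},
        {"outlook": "rain"}, {"outlook": "rain"}]
labels = ["no", "no", "yes", "yes"]

base_entropy = shannon_entropy(labels)
gain = information_gain(rows, labels, "outlook")
```

The kernel of a single event reaches its maximum at p = 1/e, while the entropy of the balanced two-class label list above is 1 bit. An entropy-based heuristic of this kind selects the attribute with the largest gain at each node.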

Highlights

  • Decision trees are used extensively in data modelling of a system and rapid real-time prediction for real complex environments [1,2,3,4,5]

  • Attribute selection in constructing a decision tree is mostly based on uncertainty heuristics, which fall into the following categories: the information entropy method based on statistical probability [11,12,13,14], the information entropy method based on rough sets [15,16,17], and the approximate uncertainty calculation method [18,19]

  • This study proposes an improved decision tree learning algorithm based on constraint gain and depth induction

Introduction

Decision trees are used extensively in system data modelling and in rapid real-time prediction for real, complex environments [1,2,3,4,5]. Given a dataset acquired by field sampling, a decision attribute is determined through a heuristic method [6,7] for training a decision tree. Attribute selection in constructing a decision tree is mostly based on uncertainty heuristics, which fall into the following categories: the information entropy method based on statistical probability [11,12,13,14], the information entropy method based on rough sets [15,16,17], and the approximate uncertainty calculation method [18,19]. Shannon information entropy [20], which is based on statistical probability, has previously been used to evaluate the uncertainty of sample set division in decision tree training [21], such as the well-known ID3 and
