Abstract

Classical decision trees such as C4.5 and CART partition the feature space using axis-parallel splits. Oblique decision trees instead split on linear combinations of features, which can simplify the structure of the decision boundary. Although oblique decision trees often achieve higher generalization accuracy, most oblique split methods do not apply directly to categorical data and are computationally expensive. In this paper, we propose a multiway-split decision tree (MSDT) algorithm that combines feature weighting with clustering. The method can combine multiple numerical features, multiple categorical features, or multiple mixed features. Experimental results show that MSDT performs well on multiple types of data.

Highlights

  • Despite the great success of deep neural network (DNN) models in image processing, speech recognition, and other fields in recent years, decision trees remain competitive with DNN schemes: they are interpretable, have fewer parameters, are robust to noise, and can be applied to large-scale data sets at lower computational cost. Therefore, the decision tree is still one of the hotspots in the field of machine learning today [1,2,3]. Research has mainly focused on construction methods for decision trees, split criteria [4], decision tree ensembles [5, 6], hybrids with other learners [7,8,9], decision trees for semisupervised learning [10], and so on

  • To keep time complexity manageable, the most popular algorithms, such as ID3 [15], C4.5 [16], and CART [17], and their various modifications [18], are greedy by nature and construct the decision tree in a top-down, recursive manner

  • Although binary trees can be used directly for multiclass problems, some binary split methods, such as FDA and the original SVM, rely on class labels, which limits algorithms such as Fisher’s decision tree (FDT) in [22] to binary classification problems


Summary

Introduction

Despite the great success of deep neural network (DNN) models in image processing, speech recognition, and other fields in recent years, decision trees remain competitive with DNN schemes: they are interpretable, have fewer parameters, are robust to noise, and can be applied to large-scale data sets at lower computational cost. Therefore, the decision tree is still one of the hotspots in the field of machine learning today [1,2,3]. Research has mainly focused on construction methods for decision trees, split criteria [4], decision tree ensembles [5, 6], hybrids with other learners [7,8,9], decision trees for semisupervised learning [10], and so on. To keep time complexity manageable, the most popular algorithms, such as ID3 [15], C4.5 [16], and CART [17], and their various modifications [18], are greedy by nature and construct the decision tree in a top-down, recursive manner. They act on only one dimension at a time, which results in axis-parallel splits. Searching for the optimal oblique hyperplanes is much more difficult than searching for the optimal axis-parallel hyperplanes, and numerous techniques have been applied to this problem, for example, hill climbing [17], simulated annealing [19], and genetic algorithms [20].

Preliminaries

The proposed decision tree method weights the features with the RELIEF-F algorithm and splits the nodes with a weighted k-means algorithm. Here dis_n(xi, μj) denotes the distance between a sample xi and a cluster center μj on the numerical variables, and dis_c(xi, μj) denotes the distance on the categorical variables. A parameter c ∈ [0, 1] adjusts the relative contribution of dis_n(xi, μj) and dis_c(xi, μj).
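The mixed distance described above can be sketched in code. This is a minimal illustration, not the paper's exact formulation: the function names are hypothetical, and the choices of a weighted Euclidean distance for dis_n, a weighted mismatch count for dis_c, and the convex combination c·dis_n + (1 − c)·dis_c are assumptions filled in from the surrounding text.

```python
import numpy as np

def mixed_distance(x_num, x_cat, mu_num, mu_cat, w_num, w_cat, c=0.5):
    """Feature-weighted distance between a sample and a cluster center
    on mixed data (a sketch; the paper's exact combination rule may differ).

    dis_n: weighted Euclidean distance on the numerical variables.
    dis_c: weighted mismatch count on the categorical variables.
    c in [0, 1] balances the numerical and categorical parts.
    """
    dis_n = np.sqrt(np.sum(w_num * (x_num - mu_num) ** 2))
    dis_c = np.sum(w_cat * (x_cat != mu_cat))
    return c * dis_n + (1.0 - c) * dis_c

def assign_cluster(x_num, x_cat, centers, w_num, w_cat, c=0.5):
    """One assignment step of the weighted k-means split: send the sample
    to the nearest center under the mixed distance."""
    dists = [mixed_distance(x_num, x_cat, mu_n, mu_c, w_num, w_cat, c)
             for mu_n, mu_c in centers]
    return int(np.argmin(dists))
```

In this sketch the weights w_num and w_cat would come from RELIEF-F, so that more relevant features contribute more to the split; each leaf of a multiway split then corresponds to one cluster.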


