Abstract
Decision trees are popular as stand-alone classifiers or as base learners in ensemble classifiers, mostly because decision trees have the advantage of being easy to explain. To improve the classification performance of decision trees, some authors have used Multivariate Decision Trees (MDTs), which allow combinations of features when splitting a node. While there is growing interest in the area, recent research on MDTs shares a common shortcoming: it does not provide an adequate comparison with related work, either because relevant rival techniques are not considered or because algorithm performance is tested on an insufficient number of databases. As a result, claims lack statistical support and, hence, there is a lack of general understanding of the actual capabilities of existing MDT induction algorithms, which is crucial to improving the state of the art. In this paper, we report on an exhaustive review of MDTs. In particular, we give an overview of 37 MDT induction algorithms, of which we have experimentally compared 19 on 57 databases. We provide a statistical comparison on all databases and on subsets of databases grouped according to the number of classes, number of features, number of instances, and degree of class imbalance. This allows us to identify groups of top-performing algorithms for different types of databases.
Highlights
Decision trees (DTs) are popular classifiers, partly because their models are easy to explain and because they show remarkable performance
There is no comprehensive comparison that determines the relative performance of existing Multivariate Decision Trees (MDTs), let alone identifies the top ones. This is both because there are no surveys about MDTs and because recent papers introducing MDTs suffer from one or two main shortcomings in their comparison with previous work: authors do not compare their algorithm with relevant rival techniques, or they do so on too few databases, so the results are insufficient to statistically validate the underlying hypothesis
Our goal with this paper is to fill this gap; that is, we aim to evaluate the relative merit of MDT induction algorithms and identify how they compare to one another
Summary
Decision trees (DTs) are popular classifiers, partly because their models are easy to explain and because they show remarkable performance. Decision tree performance is highly competitive through the use of ensembles; in a recent survey [2], Random Forest [3] and eXtreme Gradient Boosting (XGBoost) [4] are among the top-ranked algorithms. In a decision tree, each branch is tagged with a test, which evaluates to true or false for each object. For branches coming out of the same node, the tests define a partition of the database, so, for each object, one and only one of the tests evaluates to true. The tuple of tests tagging the branches from a node is known as a split because it is used to split the objects in a node into disjoint subsets during tree construction. We also use split as a verb: to split a node is to select a split and generate the corresponding child nodes
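As a minimal illustration of the distinction the paper builds on (this sketch is ours, not from the paper, and the function names and toy data are hypothetical), a univariate split tests a single feature against a threshold, whereas a multivariate (oblique) split tests a linear combination of features. Either way, the split partitions the objects in a node into disjoint subsets:

```python
import numpy as np

def univariate_split(X, feature, threshold):
    """Axis-parallel test: x[feature] <= threshold."""
    mask = X[:, feature] <= threshold
    return X[mask], X[~mask]  # each object lands in exactly one child

def multivariate_split(X, weights, threshold):
    """Oblique test: w . x <= threshold, a linear combination of features."""
    mask = X @ weights <= threshold
    return X[mask], X[~mask]

# Toy database: 5 objects, 2 features (values chosen for illustration only).
X = np.array([[1.0, 2.0],
              [3.0, 1.0],
              [0.5, 4.0],
              [2.5, 2.5],
              [4.0, 0.5]])

left, right = univariate_split(X, feature=0, threshold=2.0)
assert len(left) + len(right) == len(X)  # the two subsets partition X

left, right = multivariate_split(X, weights=np.array([0.7, -0.3]), threshold=1.0)
assert len(left) + len(right) == len(X)
```

MDT induction algorithms differ mainly in how they search for the weight vector and threshold of such oblique tests; the sketch above only shows how a given split partitions a node's objects.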