Abstract

Bayesian Decision Trees provide a probabilistic framework that reduces the instability of Decision Trees while maintaining their explainability. While Markov Chain Monte Carlo methods are typically used to construct Bayesian Decision Trees, here we provide a deterministic Bayesian Decision Tree algorithm that eliminates the sampling and does not require a pruning step. This algorithm generates the greedy-modal tree (GMT), which is applicable to both regression and classification problems. We tested the algorithm on various benchmark classification data sets and obtained accuracies similar to those of other known techniques. Furthermore, we show that we can statistically analyze how the GMT was derived from the data and demonstrate this analysis with a financial example. Notably, the GMT yields simpler, explainable models, which is often a prerequisite for applications in finance or the medical industry.

Highlights

  • The success of machine learning techniques applied to financial and medical problems can be encumbered by the inherent noise in the data

  • When the noise is not properly considered, there is a risk to overfit the data generating unnecessarily complex models that may lead to incorrect interpretations

  • There has been a lot of effort aimed at increasing model interpretability in machine learning applications [1,2,3,4,5]


Summary

INTRODUCTION

The success of machine learning techniques applied to financial and medical problems can be encumbered by the inherent noise in the data. While most Bayesian work relies on Markov Chain convergence, here we take a deterministic approach that: 1) considers the noise in the data, 2) generates less complex models, measured in terms of the number of nodes, and 3) provides a statistical framework to understand how the model is constructed. The proposed algorithm departs from [13], introduces the trivial partition to avoid the pruning step, and generalizes the approach to employ any conjugate prior. Given the input data and model parameters, the resulting tree is deterministic. Since it is deterministic, one can analyze the statistical reasons behind the choice of each node.
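The deterministic construction described above can be illustrated with a minimal sketch for binary classification under a Beta-Bernoulli conjugate prior: at each node, the trivial partition (keeping the node as a leaf) competes directly with every candidate split on marginal likelihood, and the modal choice is kept, so no separate pruning step is needed. The function names `gmt` and `log_marginal`, the single-feature threshold splits, and the implicit uniform prior over partitions are illustrative assumptions for this sketch, not the paper's implementation.

```python
import math

def log_marginal(k, n, a=1.0, b=1.0):
    # Log marginal likelihood of k positives in n points under a
    # Beta(a, b)-Bernoulli conjugate model (assumption: a = b = 1).
    return (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
            + math.lgamma(a + k) + math.lgamma(b + n - k)
            - math.lgamma(a + b + n))

def gmt(xs, ys, depth=0, max_depth=3):
    """Greedy-modal tree sketch: compare the trivial partition (leaf)
    against every threshold split and keep the most probable choice."""
    n, k = len(ys), sum(ys)
    best = ("leaf", log_marginal(k, n), None)
    if depth < max_depth:
        for t in sorted(set(xs))[1:]:  # candidate thresholds; both sides non-empty
            left = [y for x, y in zip(xs, ys) if x < t]
            right = [y for x, y in zip(xs, ys) if x >= t]
            score = (log_marginal(sum(left), len(left))
                     + log_marginal(sum(right), len(right)))
            if score > best[1]:
                best = ("split", score, t)
    if best[0] == "leaf":
        # Posterior mean of the leaf probability under Beta(1, 1).
        return {"leaf": True, "p": (k + 1.0) / (n + 2.0)}
    t = best[2]
    l_pairs = [(x, y) for x, y in zip(xs, ys) if x < t]
    r_pairs = [(x, y) for x, y in zip(xs, ys) if x >= t]
    return {"leaf": False, "threshold": t,
            "left": gmt(*zip(*l_pairs), depth + 1, max_depth),
            "right": gmt(*zip(*r_pairs), depth + 1, max_depth)}
```

On cleanly separable data the modal choice at the root is a split, while inside each pure child the trivial partition wins and the recursion stops on its own, which is how the trivial partition replaces pruning.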

BAYESIAN DECISION TREES OVERVIEW
FROM THE PARTITION PROBABILITY SPACE TO BAYESIAN DECISION TREES
BENCHMARK
Bayesian Decision Trees and GMT
TRADING EXAMPLE
Findings
DISCUSSION AND FUTURE

