Abstract

Random Forests are among the most popular classifiers in machine learning. The larger they are, the more accurate their predictions become. This, however, comes at a cost: it becomes increasingly difficult to understand why a Random Forest made a specific choice, and its classification time grows linearly with its size (the number of trees). In this paper, we propose a method to aggregate large Random Forests into a single, semantically equivalent decision diagram, which has the following two effects: (1) minimal, sufficient explanations for Random Forest-based classifications can be obtained by means of a simple three-step reduction, and (2) the running time is radically improved. In fact, our experiments on various popular datasets show speed-ups of several orders of magnitude while, at the same time, significantly reducing the size of the required data structure.

Highlights

  • Random Forests are one of the most widely known classifiers in machine learning [2,19]

  • We present an optimisation method based on algebraic aggregation: Random Forests are transformed into a single decision diagram in a semantics-preserving fashion which, in particular, preserves the learner’s variance and accuracy

  • We present an approach to aggregate large Random Forests into single, compact decision diagrams that faithfully reflect the semantics of the original Random Forest for a considered purpose
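The linear growth in classification time mentioned above can be seen in a minimal sketch (hypothetical toy trees, not from the paper): a plain Random Forest must walk every tree for every input before taking the majority vote.

```python
# Toy illustration of baseline Random Forest classification: every tree is
# evaluated for every input, so cost per classification is O(number of trees).
# The trees and feature values below are hypothetical examples.

from collections import Counter

def classify_tree(tree, x):
    """Evaluate one tree: ('leaf', cls) or ('node', feature, thresh, low, high)."""
    while tree[0] == "node":
        _, feature, thresh, low, high = tree
        tree = low if x[feature] <= thresh else high
    return tree[1]

def classify_forest(forest, x):
    """Majority vote over all trees -- one full tree walk per tree, per input."""
    votes = Counter(classify_tree(t, x) for t in forest)
    return votes.most_common(1)[0][0]

# Three hand-written decision stumps over a 2-feature input:
forest = [
    ("node", 0, 0.5, ("leaf", "A"), ("leaf", "B")),
    ("node", 1, 0.5, ("leaf", "A"), ("leaf", "B")),
    ("node", 0, 0.7, ("leaf", "A"), ("leaf", "B")),
]
```

Note that the same predicate (here, comparisons on feature 0) may be tested repeatedly across trees; this redundancy is exactly what the aggregation into a single decision diagram eliminates.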


Introduction

Random Forests are one of the most widely known classifiers in machine learning [2,19]. We present an optimisation method based on algebraic aggregation: Random Forests are transformed into a single decision diagram in a semantics-preserving fashion which, in particular, preserves the learner’s variance and accuracy. The great advantage of the resulting decision diagrams is their absence of redundancy: during classification, every predicate is considered at most once, and only if its evaluation is required. This allows one to obtain concise explanations and evaluation times that are optimal (up to an underlying predicate ordering). Key to our approach are Algebraic Decision Diagrams (ADDs) [28]. Their algebraic structure supports compositional aggregation, abstraction, and reduction operations that lead to minimal normal forms. Using basic algebraic operations, such as concatenation and addition, allows us to aggregate a Random Forest into a single ADD that faithfully maintains the individual results of each tree in the forest. Abstracting the results (i.e. the leaf structure of the decision diagrams) to the essence, in this case the
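The aggregation idea can be sketched in a few lines (a simplified illustration under our own assumptions, not the authors' implementation): two toy decision trees over a shared, ordered set of Boolean predicates are combined with an ADD-style "apply" operation that adds vote vectors pointwise, so the resulting diagram tests each predicate at most once per classification.

```python
# Minimal sketch of aggregating trees into a single decision diagram.
# Predicates are indexed 0..n-1 under a fixed global ordering; a node is
# ("node", pred, low, high) and a leaf is ("leaf", votes), where votes is a
# tuple of per-class counts. All structures here are illustrative toys.

def leaf(cls, n_classes=2):
    votes = [0] * n_classes
    votes[cls] = 1
    return ("leaf", tuple(votes))

def node(pred, low, high):
    return ("node", pred, low, high)

def apply_add(a, b):
    """Combine two diagrams by pointwise addition of their vote vectors,
    splitting on the smallest predicate index (the ADD-style 'apply' step)."""
    if a[0] == "leaf" and b[0] == "leaf":
        return ("leaf", tuple(x + y for x, y in zip(a[1], b[1])))
    # Leaves are treated as having predicate index +infinity.
    pa = a[1] if a[0] == "node" else float("inf")
    pb = b[1] if b[0] == "node" else float("inf")
    p = min(pa, pb)
    a_lo, a_hi = (a[2], a[3]) if pa == p else (a, a)
    b_lo, b_hi = (b[2], b[3]) if pb == p else (b, b)
    lo, hi = apply_add(a_lo, b_lo), apply_add(a_hi, b_hi)
    if lo == hi:  # reduction: a test whose branches agree is redundant
        return lo
    return ("node", p, lo, hi)

def classify(diagram, assignment):
    """Walk the aggregated diagram once; return the majority class."""
    while diagram[0] == "node":
        diagram = diagram[3] if assignment[diagram[1]] else diagram[2]
    votes = diagram[1]
    return max(range(len(votes)), key=votes.__getitem__)

# Two toy trees over predicates x0 and x1, aggregated into one diagram:
t1 = node(0, leaf(0), node(1, leaf(0), leaf(1)))
t2 = node(1, leaf(0), leaf(1))
forest_add = apply_add(t1, t2)  # summed vote vectors at the leaves
```

Abstracting the leaf vote vectors to their majority class (as `classify` does) then corresponds to the reduction step that collapses the aggregate to the forest's class decision; in a full implementation this abstraction enables further diagram reductions.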
2 Algebraic decision structures
3 Random forests
4 The essence of ADDs
5 Co-domain algebras and their relationships
6 Correctness and optimality
7 Infeasible path reduction
8 Towards explainability
9 Experimental performance evaluation
10 Related work
11 Conclusions and future work

