Abstract

In machine learning, ensembles based on Multi-Layer Perceptrons (MLPs) or decision trees are among the most successful models. However, explaining their responses is a complex problem that calls for new interpretation methods. A natural way to explain a model's classifications is to transform the model into propositional rules. In this work, we focus on random forests and gradient-boosted trees. Specifically, these models are converted into an ensemble of interpretable MLPs from which propositional rules are produced. The rule extraction method presented here makes it possible to precisely locate the discriminating hyperplanes that constitute the antecedents of the rules. In experiments on eight classification problems, we compared our rule extraction technique to “Skope-Rules” and other state-of-the-art techniques. Experiments were performed with ten-fold cross-validation trials, with propositional rules also generated from ensembles of interpretable MLPs. Evaluating the extracted rules in terms of complexity, fidelity, and accuracy showed that our rule extraction technique is competitive. To the best of our knowledge, this is one of the few works in which a rule extraction technique has been applied to both ensembles of decision trees and neural networks.
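
As a rough illustration of the evaluation protocol mentioned above (complexity, fidelity, and accuracy under ten-fold cross-validation), the sketch below trains a random forest and uses a shallow decision tree as a stand-in rule set. This is not the method described in this work, which converts the tree ensemble into interpretable MLPs before extracting rules; the dataset (scikit-learn's breast-cancer data) and the surrogate choice are assumptions made only for illustration.

    # Illustrative sketch only: a shallow decision tree stands in for an extracted
    # rule set; the paper's actual technique converts tree ensembles into
    # interpretable MLPs before producing propositional rules.
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import StratifiedKFold
    from sklearn.metrics import accuracy_score

    X, y = load_breast_cancer(return_X_y=True)  # placeholder dataset (assumption)
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

    for fold, (tr, te) in enumerate(cv.split(X, y), start=1):
        forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X[tr], y[tr])
        # Surrogate "rule set": each leaf of a shallow tree corresponds to one rule.
        rules = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[tr], forest.predict(X[tr]))

        acc = accuracy_score(y[te], rules.predict(X[te]))                       # rule accuracy
        fid = accuracy_score(forest.predict(X[te]), rules.predict(X[te]))       # fidelity to the ensemble
        n_rules = rules.get_n_leaves()                                          # complexity proxy
        print(f"fold {fold}: accuracy={acc:.3f} fidelity={fid:.3f} rules={n_rules}")

Here, fidelity is measured as the agreement between the rule set's predictions and the ensemble's predictions on held-out data, while complexity is approximated by the number of leaves (one rule per leaf).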

Highlights

  • Deep learning has been very successful in the last decade

  • Experiments were performed with ten-fold cross-validation trials, with propositional rules that were generated from ensembles of interpretable Multi-Layer Perceptrons (MLPs)

  • For the “Australian” classification problem, Table 12 illustrates the results provided by several rule extraction techniques applied to ensembles of Decision Trees (DTs) (ET-FBT, Random Forests (RFs)-FBT, AFBT, and InTrees)

Introduction

Deep learning has been very successful in the last decade. In domains such as computer vision, deep neural network models have significantly improved on the performance of Multi-Layer Perceptrons (MLPs), Support Vector Machines (SVMs), Decision Trees (DTs), and ensembles. For structured data, however, deep models do not offer a significant advantage over well-established “classical” models, which remain indispensable. A major problem with MLPs is that their responses are difficult to interpret; they are very often regarded as black boxes. Andrews et al. introduced a nomenclature encompassing the methods that explain neural network responses with symbolic rules [1]. Rule extraction methods have also been proposed for SVMs, which are functionally equivalent to MLPs [2].
