Abstract

The significant advantage of deep neural networks is that the upper layer can capture the high-level features of data based on the information acquired from the lower layer by stacking layers deeply. Since it is challenging to interpret what knowledge a neural network has learned, various studies for explaining neural networks have emerged to overcome this problem. However, these studies generate local explanations of a single instance rather than providing a generalized global interpretation of the neural network model itself. To overcome such drawbacks of the previous approaches, we propose a global interpretation method for deep neural networks based on the features of the model. We first analyze the relationship between the input and hidden layers to represent the high-level features of the model, and then interpret the decision-making process of the neural network through these high-level features. In addition, we apply network pruning techniques to produce concise explanations and analyze the effect of layer complexity on interpretability. We present experiments on the proposed approach using three different datasets and show that it can generate global explanations of deep neural network models with high accuracy and fidelity.
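To make the abstract's recipe concrete, here is a minimal, illustrative Python sketch (not the paper's exact FEB-RE algorithm): a small network is trained, its penultimate-layer activations are taken as the high-level features, and a shallow decision tree is fitted on those features against the network's own predictions to yield global rules. The toy data, network shape, layer choice, and surrogate tree are all assumptions made for illustration.

    # Illustrative sketch, NOT the paper's FEB-RE algorithm: global rules
    # from hidden-layer features via a surrogate decision tree.
    import torch
    import torch.nn as nn
    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier, export_text

    # Toy tabular data standing in for the paper's datasets (assumption).
    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    X_t = torch.tensor(X, dtype=torch.float32)
    y_t = torch.tensor(y)

    model = nn.Sequential(
        nn.Linear(20, 16), nn.ReLU(),  # lower layer: low-level features
        nn.Linear(16, 8), nn.ReLU(),   # upper layer: high-level features
        nn.Linear(8, 2),               # classifier head
    )
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(200):  # brief full-batch training loop
        opt.zero_grad()
        nn.functional.cross_entropy(model(X_t), y_t).backward()
        opt.step()

    # Treat penultimate-layer activations as the model's high-level features.
    with torch.no_grad():
        feats = model[:4](X_t).numpy()
        preds = model(X_t).argmax(dim=1).numpy()

    # A shallow tree fitted on (features -> model predictions) yields
    # human-readable rules that globally approximate the network.
    tree = DecisionTreeClassifier(max_depth=3).fit(feats, preds)
    print(export_text(tree, feature_names=[f"h{i}" for i in range(8)]))
    print("fidelity:", tree.score(feats, preds))  # agreement with the network

The surrogate's agreement with the network's predictions plays the role of fidelity, and the tree depth controls how concise the resulting explanation is.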

Highlights

  • The significant advantage of deep neural networks is that the upper layer can capture the high-level features of data based on the information acquired from the lower layer by stacking layers deeply

  • Unlike conventional machine learning techniques, which perform well when trained with hand-designed features extracted by humans, deep neural network models show decent performance even when using low-level data directly, because units in the upper layer can represent high-level features from information acquired in the lower layer [1]

  • To generate a global explanation for deep neural networks trained with an unstructured dataset, we propose a feature-based rule explanation (FEB-RE) method to visualize high-level features and provide a logical explanation for humans


Introduction

The significant advantage of deep neural networks is that the upper layer can capture the high-level features of data based on the information acquired from the lower layer by stacking layers deeply. However, existing studies on explaining neural networks generate local explanations of a single instance rather than providing a generalized global interpretation of the neural network model itself. To overcome such drawbacks of the previous approaches, we propose a global interpretation method for deep neural networks based on the features of the model. Unlike conventional machine learning techniques, which perform well when trained with hand-designed features extracted by humans, deep neural network models show decent performance even when using low-level data directly, because units in the upper layer can represent high-level features from information acquired in the lower layer [1]. The basic idea of transfer learning is to import the network parameters from a model trained on a similar data domain. This makes it possible to skip the process of training high-level features from low-level data and to build a new neural network from high-level features suitable for the desired application.
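As a minimal sketch of this transfer-learning idea (torchvision's ImageNet-pretrained ResNet-18 and the 10-class head below are illustrative assumptions, not components of the paper):

    # Minimal transfer-learning sketch: reuse pretrained parameters so that
    # high-level features need not be relearned from low-level data.
    import torch.nn as nn
    from torchvision import models

    # Import network parameters from a model trained on a similar domain.
    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

    # Freeze the feature extractor: its learned high-level features are reused.
    for p in backbone.parameters():
        p.requires_grad = False

    # Attach a fresh head suited to the desired application
    # (a hypothetical 10-class task); only this layer is trained.
    backbone.fc = nn.Linear(backbone.fc.in_features, 10)

Only the new head's parameters remain trainable, which is exactly the shortcut described above: the expensive step of learning high-level features from low-level data is skipped.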
