Explaining the black-box model: A survey of local interpretation methods for deep neural networks

Yu Liang, Siguang Li, Chungang Yan, Maozhen Li, Changjun Jiang

doi:10.1016/j.neucom.2020.08.011

Open Access

https://doi.org/10.1016/j.neucom.2020.08.011

Journal: Neurocomputing
Publication Date: Sep 3, 2020
Citations: 109
License type: publisher-specific-oa

Affiliation: Tongji University, Binzhou University

Abstract

Recently, a significant amount of research has investigated the interpretation of deep neural networks (DNNs), which are normally treated as black-box models. Among the methods that have been developed, local interpretation methods stand out for their clear expression of interpretations and low computational complexity. Unlike existing surveys, which cover a broad range of DNN interpretation methods, this survey focuses on local interpretation methods and provides an in-depth analysis of representative works, including newly proposed approaches. From the perspective of principles, we first divide local interpretation methods into two main categories: model-driven methods and data-driven methods. We then make fine-grained distinctions within each category and highlight the latest ideas and principles. We further demonstrate the effects of a number of interpretation methods by reproducing their results with open-source software plugins. Finally, we point out research directions in this rapidly evolving field.
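
To make the notion of a local interpretation concrete, the following is a minimal sketch of one perturbation-style idea: each input feature is scored by how much the model output changes for a single input when that feature is replaced by a baseline value. This is an illustration only and is not taken from the survey. The names `toy_model` and `occlusion_attribution`, the fixed weights, and the zero baseline are all hypothetical, and a small logistic scorer stands in for a trained DNN.

```python
import math


def toy_model(x):
    """Hypothetical stand-in for a trained network: a fixed logistic scorer."""
    weights = [2.0, -1.0, 0.0]  # assumed weights, chosen only for illustration
    bias = 0.5
    z = sum(w * v for w, v in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def occlusion_attribution(model, x, baseline=0.0):
    """Score each feature by the output change when it is set to `baseline`."""
    reference = model(x)
    scores = []
    for i in range(len(x)):
        perturbed = list(x)
        perturbed[i] = baseline  # occlude one feature at a time
        scores.append(reference - model(perturbed))
    return scores


# Explain the prediction for one specific input (hence "local").
scores = occlusion_attribution(toy_model, [1.0, 1.0, 1.0])
```

Under these assumptions, a positive score marks a feature that pushes this particular prediction up, a negative score marks one that pushes it down, and a score near zero marks a feature the model ignores for this input. The scores describe only the chosen input, not the model's behaviour in general.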

Full Text