Abstract

The relation between the input and output spaces of neural networks (NNs) is investigated to identify those characteristics of the input space that have a large influence on the output for a given task. For this purpose, the NN function is decomposed into a Taylor expansion in each element of the input space. The Taylor coefficients contain information about the sensitivity of the NN response to the inputs. A metric is introduced that allows for the identification of the characteristics that mostly determine the performance of the NN in solving a given task. Finally, the capability of this metric to analyze the performance of the NN is evaluated based on a task common to data analyses in high-energy particle physics experiments.
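As an illustration of how such Taylor coefficients can be obtained in practice, the sketch below computes the sample-averaged absolute first-order derivatives of a network output with respect to each input variable. The toy network, the random input sample, and the averaging choice are illustrative assumptions rather than details taken from the text; PyTorch is used here only as a convenient automatic-differentiation backend.

```python
import torch

# Hypothetical stand-in for a trained NN f: R^4 -> [0, 1]; any differentiable model works.
model = torch.nn.Sequential(
    torch.nn.Linear(4, 16),
    torch.nn.Tanh(),
    torch.nn.Linear(16, 1),
    torch.nn.Sigmoid(),
)

def first_order_coefficients(model, x):
    """Return <|df/dx_i|>, the sample-averaged absolute first-order Taylor coefficients."""
    x = x.clone().requires_grad_(True)
    y = model(x).sum()                   # summing keeps the per-sample gradients independent
    (grad,) = torch.autograd.grad(y, x)  # shape (n_samples, n_inputs)
    return grad.abs().mean(dim=0)        # one sensitivity value per input variable

x_batch = torch.randn(1000, 4)           # placeholder input sample
print(first_order_coefficients(model, x_batch))
```

Under this reading of the metric, input variables with larger average coefficients are the candidates that most strongly drive the network response for the given task.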

Highlights

  • A neural network (NN) is a multi-parameter system, which, depending on its architecture, can consist of several thousand weight and bias parameters, subject to one or more non-linear activation functions

  • While this study demonstrates the application of the Taylor expansion only up to second order (see the sketch after this list), we explicitly propose a generalization towards higher-order derivatives in the Taylor expansion to capture relations across variables, which usually play a more important role in data analyses in high-energy particle physics experiments

  • We have discussed the usage of the coefficients t_i from a Taylor expansion in each element of the input space {x_j} to identify the characteristics of the input space with the largest influence on the NN output
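A possible extension of the coefficients t_i to second order, along the lines of the generalization proposed above, is sketched here. The network, the sample, and the averaging of absolute Hessian entries are assumptions made for illustration, not the authors' implementation.

```python
import torch

# Hypothetical trained network, as in the first-order sketch above.
model = torch.nn.Sequential(
    torch.nn.Linear(4, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1), torch.nn.Sigmoid()
)

def second_order_coefficients(model, x):
    """Return <|d^2 f / dx_i dx_j|>, sample-averaged second-order Taylor coefficients.

    Off-diagonal entries probe relations across pairs of input variables.
    """
    n_samples, n_inputs = x.shape
    acc = torch.zeros(n_inputs, n_inputs)
    for sample in x:  # explicit loop for clarity, not speed
        hess = torch.autograd.functional.hessian(
            lambda v: model(v.unsqueeze(0)).squeeze(), sample
        )
        acc += hess.abs()
    return acc / n_samples

x_batch = torch.randn(200, 4)  # placeholder input sample
print(second_order_coefficients(model, x_batch))
```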


Summary

Introduction

A neural network (NN) is a multi-parameter system, which, depending on its architecture, can consist of several thousand weight and bias parameters, subject to one or more non-linear activation functions. Each of these adjustable parameters obtains its concrete value and meaning through minimisation during the training process. Deviations need to be identified and quantified within the uncertainty model of the hypothesis test. They may occur in the description of single input variables to the NN, as well as in correlations across input variables, even if the marginal distributions of the individual input variables are reproduced. To make sure that the performance gain of the NN is not feigned, the correlations across input variables therefore need to be validated in addition to the marginal distributions.
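The point that correlations can differ even when the marginal distributions agree can be made concrete with a small toy construction; the numbers below are purely illustrative and unrelated to any dataset used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Two toy samples with identical standard-normal marginals for (x1, x2) ...
independent = rng.standard_normal((n, 2))                                    # x1, x2 uncorrelated
z = rng.standard_normal(n)
correlated = np.stack([z, 0.8 * z + 0.6 * rng.standard_normal(n)], axis=1)   # corr(x1, x2) ~ 0.8

# ... but with a clearly different correlation structure.
print(np.corrcoef(independent.T)[0, 1])   # close to 0.0
print(np.corrcoef(correlated.T)[0, 1])    # close to 0.8
```

A validation based only on the marginal distributions would treat both samples as equivalent, which is exactly the failure mode described above.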
