Abstract

This paper proposes a new method, called “neural self-compressors,” that compresses multi-layered neural networks into their simplest possible form (i.e., networks without hidden layers) to aid the interpretation of relations between inputs and outputs. Although neural networks have shown great success in improving generalization, interpreting their internal representations becomes a serious problem as the number of hidden layers and corresponding connection weights grows. To overcome this interpretation problem, we introduce a method that compresses multi-layered neural networks into networks without hidden layers. In addition, the method simplifies entangled weights as much as possible by maximizing mutual information between inputs and outputs. In this way, the final connection weights can be interpreted as easily as coefficients obtained by logistic regression analysis. The method was applied to four data sets: a symmetric data set, an ovarian cancer data set, a restaurant data set, and a credit card holders’ default data set. With the first, the symmetric data set, we explain intuitively how the present method produces interpretable outputs. In all the other cases, we succeeded in compressing multi-layered neural networks into their simplest forms with the help of mutual information maximization. In addition, by de-correlating outputs, we were able to transform the connection weights from values close to regression coefficients into weights with more explicit features.
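To make the compression idea concrete, the sketch below fits a single input-to-output weight vector so that a hidden-layer-free model mimics a deep network's predictions, yielding weights that can be read like logistic regression coefficients. This is a minimal illustration under stated assumptions, not the paper's mutual-information-based procedure; the toy network `deep_net`, its random stand-in weights, and the learning rate are all hypothetical choices for the example.

```python
# Minimal sketch: distill a multi-layered network into a model with no
# hidden layers by fitting direct input->output weights to its outputs.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A toy "trained" two-hidden-layer network; random weights stand in for
# a genuinely trained model in this illustration.
W1, b1 = rng.normal(size=(6, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 4)), rng.normal(size=4)
W3, b3 = rng.normal(size=(4, 1)), rng.normal(size=1)

def deep_net(X):
    h1 = np.tanh(X @ W1 + b1)
    h2 = np.tanh(h1 @ W2 + b2)
    return sigmoid(h2 @ W3 + b3)

# "Compression": train a hidden-layer-free model (one weight per input)
# to reproduce the deep network's outputs on sample data.
X = rng.normal(size=(500, 6))
y = deep_net(X)                      # targets are the deep net's outputs

w, b = np.zeros((6, 1)), 0.0
for _ in range(2000):
    p = sigmoid(X @ w + b)
    grad_w = X.T @ (p - y) / len(X)  # cross-entropy gradient w.r.t. w
    grad_b = np.mean(p - y)          # cross-entropy gradient w.r.t. b
    w -= 0.5 * grad_w
    b -= 0.5 * grad_b

# Each entry of w now summarizes one input's direct effect on the output,
# analogous to a logistic regression coefficient.
print("compressed input->output weights:", w.ravel())
```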