A Simple Convolutional Neural Network with Rule Extraction

Guido Bologna

doi:10.3390/app9122411

Abstract

Classification responses provided by Multi Layer Perceptrons (MLPs) can be explained by means of propositional rules. Many rule extraction techniques have been proposed for shallow MLPs, but not for Convolutional Neural Networks (CNNs). To fill this gap, this work presents a new rule extraction method applied to a typical CNN architecture used in Sentiment Analysis (SA). We focus on textual data: the CNN is trained on “tweets” of movie reviews. Its architecture comprises an input layer representing words by “word embeddings”, a convolutional layer, a max-pooling layer, and a fully connected layer. Rule extraction is performed on the fully connected layer with the help of the Discretized Interpretable Multi Layer Perceptron (DIMLP). This transparent MLP architecture allows us to generate symbolic rules by precisely locating axis-parallel hyperplanes. Cross-validation experiments show that our approach is more accurate than variants in which Support Vector Machines (SVMs) or decision trees are substituted for the DIMLP. Overall, the rules reach high fidelity, and the discriminative n-grams represented in the antecedents explain the classifications adequately. With several test examples, we illustrate the n-grams represented in the activated rules; each of them contributes to the final classification with a certain intensity.
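To make the architecture described in the abstract concrete, the following is a minimal sketch of a forward pass through a convolutional layer, a max-pooling layer, and a fully connected layer over word embeddings. It is an illustration under stated assumptions and not the implementation used in this work: the function names (`conv_max_pool`, `dense`), the ReLU activation, the filter width, the layer sizes, and the random parameters are all hypothetical, and the DIMLP rule extraction step is not shown. The sketch also records which window each filter selected during max-pooling, which is one plausible way of relating a pooled feature back to an n-gram.

```python
import random

def conv_max_pool(embeddings, filters, biases, width):
    """For each filter, slide over every window of `width` consecutive word
    vectors, apply ReLU (an assumption), and keep the maximum response
    together with the start index of the window that produced it."""
    n_windows = len(embeddings) - width + 1  # assumes len(embeddings) >= width
    features, positions = [], []
    for filt, bias in zip(filters, biases):
        best_val, best_pos = float("-inf"), -1
        for start in range(n_windows):
            # Flatten the window of `width` word vectors into a single vector.
            window = [x for vec in embeddings[start:start + width] for x in vec]
            act = max(0.0, sum(w * x for w, x in zip(filt, window)) + bias)
            if act > best_val:  # ties keep the earliest window
                best_val, best_pos = act, start
        features.append(best_val)
        positions.append(best_pos)
    return features, positions

def dense(features, weights, biases):
    """Fully connected layer: one weighted sum (score) per output class."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, biases)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
tokens = ["a", "truly", "great", "movie"]
dim, width, n_filters, n_classes = 3, 2, 4, 2  # hypothetical sizes

# Random stand-ins for trained parameters (illustrative only).
embeddings = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in tokens]
filters = [[rng.uniform(-1, 1) for _ in range(dim * width)]
           for _ in range(n_filters)]
filter_biases = [0.0] * n_filters
fc_weights = [[rng.uniform(-1, 1) for _ in range(n_filters)]
              for _ in range(n_classes)]
fc_biases = [0.0] * n_classes

features, positions = conv_max_pool(embeddings, filters, filter_biases, width)
scores = dense(features, fc_weights, fc_biases)

# The n-gram each filter selected: the words in its maximally activated window.
ngrams = [" ".join(tokens[p:p + width]) for p in positions]
print(ngrams, scores)
```

In this sketch each pooled feature corresponds to exactly one n-gram of the input, so a rule whose antecedents are thresholds on the pooled features can be read in terms of the words that triggered them.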

Highlights

Artificial neural networks learn by examining numerous examples many times
Experiments based on cross-validation show that our approach is more accurate than variants in which Support Vector Machines (SVMs) or decision trees are substituted for the Discretized Interpretable Multi Layer Perceptron (DIMLP)
Convolutional Neural Networks (CNNs) are compared to Support Vector Machines (SVMs) [40]

Summary

Introduction

Artificial neural networks learn by examining numerous examples many times. It is very difficult to explain their decisions, because their knowledge is embedded within the values of the parameters and neuron activations, which are at first glance incomprehensible. Deep neural networks are at the root of the significant progress accomplished over the past five years in areas such as artificial vision, natural language processing and speech recognition. A number of studies have been conducted to clarify the potential of deep models, such as Convolutional Neural Networks (CNNs), in Sentiment Analysis (SA) [1,2]. The transparency of bio-inspired models is currently an open and important research topic because, in the long term, the acceptance of these models will depend on it. Transparency is essential in relation to a recent European General Data Protection

Results

Discussion

Conclusion