Abstract

Sentence embeddings are well-suited input vectors for the neural learning of a number of inferences about content and meaning. Similarity estimation, classification, and emotional characterization of sentences, as well as pragmatic tasks such as question answering or dialogue, have largely demonstrated the effectiveness of vector embeddings in modeling semantics. Unfortunately, most of the above decisions are epistemologically opaque, given the limited interpretability of the neural models built on the involved embeddings. We think that any effective approach to meaning representation should be at least epistemologically coherent. In this paper, we concentrate on the readability of neural models as a core property of any embedding technique that is consistent and effective in representing sentence meaning. From this perspective, this paper discusses a novel embedding technique (the Nyström methodology) that corresponds to the reconstruction of a sentence in a kernel space, inspired by rich semantic similarity metrics (a semantic kernel) rather than by a language model. In addition to being based on a kernel that captures grammatical and lexical semantic information, the proposed embedding can be used as the input vector of an effective neural learning architecture, called Kernel-based Deep Architectures (KDA). Finally, it also characterizes by design the KDA explanatory capability, as the proposed embedding is derived from examples that are both human-readable and labeled. This property is obtained by integrating KDAs with an explanation methodology, called layer-wise relevance propagation (LRP), already proposed in image processing. The Nyström embeddings support here the automatic compilation of argumentations for or against a KDA inference, in the form of an explanation: each decision can in fact be linked through LRP back to the real examples, that is, the landmarks linguistically related to the input instance. The KDA network output is explained via the analogy with the activated landmarks. Quantitative evaluation of the explanations shows that richer explanations based on semantic and syntagmatic structures characterize convincing arguments, as they effectively help the user in assessing whether or not to trust the machine decisions in different tasks, for example, Question Classification or Semantic Role Labeling. This confirms the epistemological benefit that Nyström embeddings may bring, as linguistically rich and meaningful representations for a variety of inference tasks.
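To make the abstract's core idea concrete, the following is a minimal sketch of a standard Nyström embedding, not the paper's exact implementation: given a kernel evaluated between inputs and a set of labeled landmark examples (K_xl) and among the landmarks themselves (K_ll), each input is projected into a low-dimensional space whose coordinates are tied to the landmarks. The function name, the use of an RBF kernel, and the variable names are illustrative assumptions; the paper's semantic kernel over sentences would replace the toy kernel here.

```python
import numpy as np

def nystrom_embed(K_xl: np.ndarray, K_ll: np.ndarray) -> np.ndarray:
    """Project inputs into the Nystrom space defined by m landmarks.

    K_xl : (n, m) kernel values between the n inputs and the m landmarks.
    K_ll : (m, m) kernel values among the landmarks (symmetric PSD).
    Returns an (n, m) embedding Phi such that Phi @ Phi.T approximates
    the full (n, n) kernel matrix over the inputs.
    """
    # W^{-1/2} via SVD of the symmetric PSD landmark Gram matrix.
    U, s, Vt = np.linalg.svd(K_ll)
    s_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(s, 1e-12)))  # guard small eigenvalues
    return K_xl @ U @ s_inv_sqrt @ Vt

# Toy usage with an RBF kernel (illustrative stand-in for a semantic kernel).
def rbf_kernel(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    sq_dists = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists)
```

Because each embedding dimension corresponds to one landmark, the representation remains inspectable: large components identify the labeled examples most similar to the input, which is what the LRP-based explanations described above exploit.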
