Abstract

In recent years, there has been considerable interest in achieving complex reasoning with deep neural networks. To that end, models such as Memory Networks (MemNNs) have combined external memory storages with attention mechanisms. These architectures, however, lack the more complex reasoning mechanisms that would allow, for instance, relational reasoning. Relation Networks (RNs), on the other hand, have shown outstanding results in relational reasoning tasks. Unfortunately, their computational cost grows quadratically with the number of memories, which is prohibitive for larger problems. To address these issues, we introduce the Working Memory Network, a MemNN architecture with a novel working memory storage and reasoning module. Our model retains the relational reasoning abilities of the RN while reducing its computational complexity from quadratic to linear. We tested our model on the text QA dataset bAbI and the visual QA dataset NLVR. On the jointly trained bAbI-10k benchmark, we set a new state of the art, achieving a mean error of less than 0.5%. Moreover, a simple ensemble of two of our models solves all 20 tasks in the joint version of the benchmark.
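
To make the quadratic-versus-linear contrast concrete, the short Python sketch below (an illustration, not code from the paper) counts the pair-function evaluations an RN performs versus the attention scores a single memory read computes as the number of stored memories n grows.

```python
# Illustrative scaling only: an RN scores every ordered pair of memories,
# while an attention-based read scores each memory once.
for n in (10, 100, 1000):
    rn_pair_evals = n * n        # quadratic in the number of memories
    attention_scores = n         # linear in the number of memories
    print(f"n={n:>5}: RN pair evaluations={rn_pair_evals:>9,}  "
          f"attention scores={attention_scores:>5,}")
```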

Highlights

  • A central ability needed to solve daily tasks is complex reasoning

  • Relation Networks (RNs) must perform O(n²) forward and backward passes (see the pairwise sketch after this list)

  • We have proposed a novel Working Memory Network architecture that introduces improved reasoning abilities to the original MemNN model
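
The O(n²) cost named above comes from applying the RN's pair function to every ordered pair of stored memories. Below is a minimal NumPy sketch of that pairwise aggregation; the tiny MLPs g and f and all shapes are illustrative assumptions, not the paper's actual parameterization.

```python
import numpy as np

def mlp(x, w1, w2):
    """Tiny two-layer MLP with ReLU, a stand-in for the RN functions g and f."""
    return np.maximum(x @ w1, 0.0) @ w2

rng = np.random.default_rng(0)
n, d, h = 8, 16, 32                      # memories, memory size, hidden size
memories = rng.normal(size=(n, d))       # the n stored memories

# Hypothetical parameters for the pair function g and the readout f.
g_w1, g_w2 = rng.normal(size=(2 * d, h)), rng.normal(size=(h, h))
f_w1, f_w2 = rng.normal(size=(h, h)), rng.normal(size=(h, 1))

# The RN applies g to every ordered pair of memories: n * n forward passes.
pair_sum = np.zeros(h)
for i in range(n):
    for j in range(n):
        pair = np.concatenate([memories[i], memories[j]])
        pair_sum += mlp(pair, g_w1, g_w2)

answer = mlp(pair_sum, f_w1, f_w2)       # f maps the summed pair features to an output
print(answer.shape)                      # (1,)
```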

Introduction

A central ability needed to solve daily tasks is complex reasoning. It involves the capacity to comprehend and represent the environment, retain information from past experiences, and solve problems based on the stored information. Unlike symbolic approaches to complex reasoning, deep neural networks can learn representations from perceptual information. Because of that, they do not suffer from the symbol grounding problem (Harnad, 1999) and can generalize better than classical symbolic approaches. Most of these neural network models make use of an explicit memory storage and an attention mechanism. After some memories have been attended to, using a multi-step procedure, the attended memories are combined and passed through a simple output layer that produces a final answer. While this allows some multi-step inferential process, these networks lack a more complex reasoning mechanism, needed for more elaborate tasks such as inferring relations among entities (relational reasoning). To solve these problems, we propose the Working Memory Network, a novel Memory Network architecture with a working memory storage and a reasoning module.
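
The multi-step attention procedure described above can be sketched as repeated reads from an external memory. The NumPy fragment below is a minimal illustration under assumed shapes and randomly initialized parameters (the hop-update matrix, the output layer, and the number of hops are hypothetical), not the architecture proposed in this work.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(1)
n, d, hops = 8, 16, 3                     # memories, embedding size, attention hops
memories = rng.normal(size=(n, d))        # external memory storage
query = rng.normal(size=d)                # encoded question

# Hypothetical hop-update and output parameters, for illustration only.
hop_w = rng.normal(size=(d, d))
out_w = rng.normal(size=(d, 5))           # 5 candidate answers, arbitrary

for _ in range(hops):
    scores = memories @ query             # attend: similarity of each memory to the query
    weights = softmax(scores)             # normalized attention weights
    read = weights @ memories             # combine attended memories (weighted sum)
    query = (query + read) @ hop_w        # update the query for the next hop

logits = query @ out_w                    # simple output layer producing the answer
print(logits.argmax())
```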
