Abstract

Feedforward neural network architectures work well for numerical data of fixed size, such as images. For variable-size, structured data, such as sequences, d-dimensional grids, trees, and other graphs, recursive architectures must be used. We distinguish two general approaches to the design of recursive architectures in deep learning: the inner and the outer approach. The inner approach uses neural networks recursively inside the data graphs, essentially to “crawl” the edges of the graphs in order to compute the final output. It requires acyclic orientations of the underlying graphs. The outer approach uses neural networks recursively outside the data graphs, regardless of their orientation. These neural networks operate orthogonally to the data graph and progressively “fold” or aggregate the input structure to produce the final output. The distinction is illustrated using several examples from the fields of natural language processing, chemoinformatics, and bioinformatics, and is applied to the problem of learning from variable-size sets.
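
To make the distinction concrete, here is a minimal, dependency-free Python sketch, not taken from the paper: the names `shared_cell`, `inner_forward`, and `outer_forward` are hypothetical, and a plain sum stands in for a learned, weight-shared network. The inner pass recurses along the edges of the data graph itself under an acyclic orientation, while the outer pass updates all nodes in parallel, irrespective of orientation, and then aggregates their states.

```python
class Node:
    def __init__(self, features, children=()):
        self.features = features
        self.children = list(children)

def shared_cell(x, neighbor_states):
    # Stand-in for a learned, weight-shared network; a plain sum keeps
    # the sketch dependency-free and runnable.
    return x + sum(neighbor_states)

def inner_forward(node):
    # Inner approach: recurse along the edges of the data graph itself,
    # following an acyclic orientation (here: from the leaves to the root).
    return shared_cell(node.features, [inner_forward(c) for c in node.children])

def outer_forward(nodes, neighbors, rounds=2):
    # Outer approach: orientation-free; the same network updates every node
    # from its neighbors, and the node states are finally aggregated
    # ("folded") into a single output.
    states = {n: n.features for n in nodes}
    for _ in range(rounds):
        states = {n: shared_cell(states[n], [states[m] for m in neighbors[n]])
                  for n in nodes}
    return sum(states.values())

# A three-node path a - b - c, rooted at b for the inner pass.
a, c = Node(1.0), Node(3.0)
b = Node(2.0, children=[a, c])
print(inner_forward(b))                                      # 6.0
print(outer_forward([a, b, c], {a: [b], b: [a, c], c: [b]}))  # aggregated state
```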

Highlights

  • Many problems in machine learning involve data items represented by vectors or tensors of fixed size

  • The data items come with a structure that is often represented by a graph. This is the case for sentences or parse trees in natural language processing, molecules or reactions in chemoinformatics, and nucleotide or amino acid sequences and their two-dimensional contact maps in bioinformatics

  • We present a classification of the known approaches into two basic classes: the inner class and the outer class

Summary

Introduction

Many problems in machine learning involve data items represented by vectors or tensors of fixed size. This is the case, for instance, in computer vision with images. Many other problems, however, involve variable-size, structured data, for which recursive architectures are needed. A recursive network is a network that contains connection weights, often entire subnetworks, that are shared, often in a systematic way (Fig. 1). Even a siamese network can be called recursive, although it combines only two copies of the same network. Any recurrent network unfolded in time yields a highly recursive network. The notion of a recursive network becomes most useful when there is a regular pattern of connections, associated for instance with a lattice (see examples below).
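
As an illustration of this weight-sharing idea, the following hedged Python sketch (the names `f`, `siamese`, and `recurrent_unrolled` are hypothetical, and a single scalar weight stands in for an entire subnetwork) shows how a siamese pair and an unfolded recurrent network both reuse the same parameters:

```python
W = 0.5  # one shared scalar, standing in for a whole set of shared weights

def f(x):
    # The shared subnetwork: reused by both copies of the siamese pair
    # and at every time step of the unrolled recurrent network.
    return W * x

def siamese(x1, x2):
    # Two copies of the same network applied to two inputs,
    # compared at the output.
    return abs(f(x1) - f(x2))

def recurrent_unrolled(sequence, h=0.0):
    # A recurrent network unfolded in time: the same cell (same weights)
    # is replicated once per element of the input sequence.
    for x in sequence:
        h = f(h) + f(x)
    return h

print(siamese(2.0, 5.0))                    # 1.5
print(recurrent_unrolled([1.0, 2.0, 3.0]))  # 2.125
```

In both cases the recursion lies in the replication of one subnetwork; what distinguishes the inner and outer approaches, developed in the sections below, is where that replication takes place relative to the data graph.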

The inner approach
The outer approach
The problem of learning from sets
Discussion