A study of feature binding in artificial neural networks with sigmoid and complex activation functions

Hanna Majewski

doi:10.14264/uql.2019.503

Abstract

The issue of correctly binding the component parts of an object, and not mistaking them for features belonging to other objects, is called the binding problem. The phenomenon of confusing the features of multiple objects is known as a "ghost effect". In the real world, the brain needs to analyse the world into its component parts, store them in this way and then bind these parts to make whole entities and events for recognition and recall.

Various methods have been proposed for binding the features of visual objects. The simplest is "local representations", in which each item is represented by an individual unit or neuron. This method, although straightforward, requires an exponential number of neurons and does not support generalization. A more complex method is to use associations to bind together pairs of components of an object. Such pairwise associations fail to capture the multiple bindings required for representing objects. Higher-order binding can be represented using spatio-temporal correlations, in which phase is used to bind together the features of an object.

These representational methods differ in their capacity to encode binding information. In this thesis, simulations were used to study neural networks tested on a variety of binding tasks. The initial study involves mapping from two variables to one variable in a single time step. The simulations show that the networks are able to make simple associations between features but not temporal ones. In the second study, a sequential binding task is presented which requires a recurrent neural network to translate from a feature-based to a combinatorial scene-based representation. The mechanisms for binding information also depend on representational capacity. Binding information is easily carried by phase, but phase is not usually a component of neural network models.
We propose a complex version of backpropagation for use with complex domain recurrent networks and assess the resources and requirements of the Simple Recurrent Network (SRN) and the Complex Domain Recurrent Network (CDRN) in simulations of the sequential binding task. Simulations demonstrate the improved performance and capacity of the CDRN.

This thesis investigates whether multi-layer Artificial Neural Networks (ANN) are able to construct internal representations that maintain visual binding information, and what mechanisms could encode the binding information of multiple objects. Specific questions raised:

1. What is the behaviour of the network when translating two variables to one variable, for example, features <colour, shape> to object <colour-shape> (e.g., the colour and shape of an object are integrated to form one coherent object) (FFN)?
2. What is the behaviour of the network when information arrives over several time steps, for example, two variables and three time steps (e.g., three objects, one at each time step, each comprising two variables) (SRN and CDRN)?
3. How well does a three-layer ANN perform spatial and temporal binding (SRN and CDRN)?
4. What are the limitations of a three-layer ANN for spatial and temporal binding (SRN and CDRN)?

Analyses (using Principal Components Analysis and Canonical Discriminants Analysis) of the hidden unit space of the FFN, SRN, and CDRN showed well-organized binding structure at the level of features and single objects, representing spatial binding, and at the level of multiple scenes, representing temporal binding.

The studies show that capacity is a key problem for all the networks as the number of features to be bound together increases. What differs between the networks is how quickly they reach their representational limits.
The representational limits are the result of the architectural limitations of the networks and also the learning algorithms used for testing the networks with the specific tasks. This study has shown that the analyses of the internal representations of the networks, and the conclusions drawn from the observations, play a very important role in the design of the architectures and learning algorithms of the new networks.
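
As an informal illustration of why phase can carry binding information, the toy sketch below contrasts a magnitude-only code with a complex-valued code for two scenes built from the same four features. It is not the CDRN or the learning algorithm from the thesis: the feature names, the functions `encode`, `magnitudes`, and `decode`, and the rule of one evenly spaced phase slot per object are all assumptions made for this example.

```python
import cmath
import math

# Hypothetical feature inventory, chosen only for this illustration.
FEATURES = ["red", "blue", "square", "circle"]


def encode(scene):
    """Encode a scene as one complex activation per feature unit.

    Magnitude marks a feature as present; phase marks which object it
    belongs to. Assumes no feature is shared by two objects in a scene.
    """
    acts = {feat: 0j for feat in FEATURES}
    for k, obj in enumerate(scene):
        phase = 2 * math.pi * k / len(scene)  # one phase slot per object
        for feat in obj:
            acts[feat] += cmath.rect(1.0, phase)
    return acts


def magnitudes(acts):
    """Discard phase, as a real-valued presence code would."""
    return {feat: round(abs(z), 6) for feat, z in acts.items()}


def decode(acts, n_objects):
    """Group active features by phase slot to recover the objects."""
    slot = 2 * math.pi / n_objects
    groups = [set() for _ in range(n_objects)]
    for feat, z in acts.items():
        if abs(z) < 0.5:  # feature not present in the scene
            continue
        k = round((cmath.phase(z) % (2 * math.pi)) / slot) % n_objects
        groups[k].add(feat)
    return groups


scene_a = [("red", "square"), ("blue", "circle")]
scene_b = [("red", "circle"), ("blue", "square")]

# Without phase the two scenes are indistinguishable (the ghost effect).
same_magnitudes = magnitudes(encode(scene_a)) == magnitudes(encode(scene_b))

# With phase, each scene decodes back to its own feature groupings.
objects_a = decode(encode(scene_a), 2)
objects_b = decode(encode(scene_b), 2)
```

Here `same_magnitudes` is true, while `objects_a` and `objects_b` differ. The sketch breaks down as soon as two objects share a feature, since their contributions can cancel, which is one simple way to see that capacity limits appear quickly as the number of bound features grows.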
