Abstract

Although an action observation network and mirror neurons for understanding the actions and intentions of others have been under deep, interdisciplinary consideration over recent years, it remains largely unknown how the brain manages to map visually perceived biological motion of others onto its own motor system. This paper shows how such a mapping may be established, even when the biological motion is visually perceived from a new vantage point. We introduce a learning artificial neural network model and evaluate it on full-body motion tracking recordings. The model implements an embodied, predictive inference approach. It first learns to correlate and segment multimodal sensory streams of its own bodily motion. In doing so, it becomes able to anticipate motion progression, to complete missing modal information, and to self-generate learned motion sequences. When the biological motion of another person is observed, this self-knowledge is utilized to recognize similar motion patterns and predict their progress. Due to the relative encodings, the model shows strong robustness in recognition despite observing rather large varieties of body morphology and posture dynamics. By additionally equipping the model with the capability to rotate its visual frame of reference, it is able to deduce the visual perspective onto the observed person, establishing full consistency with the embodied self-motion encodings by means of active inference. In further support of its neuro-cognitive plausibility, we also model typical bistable perceptions when crucial depth information is missing. In sum, the introduced neural model proposes a solution to the problem of how the human brain may establish correspondence between observed bodily motion and its own motor system, thus offering a mechanism that supports the development of mirror neurons.

Highlights

  • Neuroscience has labeled a distributed network of brain regions that appears to be involved in action understanding and social cognition the mirror neuron system (Rizzolatti and Craighero, 2004, 2005; Iacoboni and Dapretto, 2006; Kilner et al., 2007; Iacoboni, 2009).

  • The superior temporal sulcus (STS) is well-known for encoding biological motion patterns (Bruce et al., 1981; Perrett et al., 1985; Oram and Perrett, 1994) and has been considered an important visual modality for the development of attributes linked with the mirror neuron system (Grossman et al., 2000; Gallese, 2001; Puce and Perrett, 2003; Ulloa and Pineda, 2007; Pavlova, 2012; Cook et al., 2014).

  • We show that (1) the model is able to encode a realistic walking movement when both visual and proprioceptive stimuli are present during self-perception; (2) multiple movements, each in multiple frames of reference, can be encoded in largely disjoint sets of motion patterns; and (3) upon observation, this enables the transformation of randomly oriented views of similar biological motion into the previously learned frames of reference, solving the correspondence problem and deriving others' perspectives.

Introduction

Neuroscience has labeled a distributed network of brain regions that appears to be involved in action understanding and social cognition the mirror neuron system (Rizzolatti and Craighero, 2004, 2005; Iacoboni and Dapretto, 2006; Kilner et al., 2007; Iacoboni, 2009). While the existence of mirror neurons in the human brain, as well as their primary role in action understanding, is still controversial (see, e.g., the discussion following Lingnau et al., 2009), the existence of such a network and the inclusion of our own motor system in it is generally accepted. It is still strongly disputed, however, how this network may develop (Kilner and Lemon, 2013; Cook et al., 2014). Our model is able to project visually perceived biological motion of others onto its own action encodings, resulting in the co-activation of corresponding proprioceptive codes. In line with encodings of biological motion in STS, our learning algorithm is capable of encoding visual motion redundantly in multiple orientations. These view-dependent encodings form perceptual attractor states, which may be compared with the attractors found for object recognition when mental rotations are involved (Palmer et al., 1981; Tarr and Pinker, 1989).
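The core idea of deducing the visual perspective onto an observed person can be illustrated in miniature: given a canonical ("self") encoding of a posture, active inference over a rotation of the visual frame of reference descends the prediction-error gradient until the observed view becomes consistent with the self-encoding. The following is only a minimal sketch under simplifying assumptions (a single rotation about the vertical axis, a finite-difference gradient, toy joint positions); all names are illustrative and this is not the paper's actual implementation.

```python
import numpy as np

def rot_y(theta):
    """Rotation matrix about the vertical (y) axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def infer_view_angle(observed, canonical, lr=0.2, steps=400, eps=1e-4):
    """Infer the viewing angle by gradient descent on the mean squared
    prediction error E(theta) = mean ||observed - R_y(theta) @ canonical||^2.
    A finite-difference gradient stands in for the model's learned one."""
    theta = 0.0
    def error(t):
        return np.mean((observed - canonical @ rot_y(t).T) ** 2)
    for _ in range(steps):
        grad = (error(theta + eps) - error(theta - eps)) / (2 * eps)
        theta -= lr * grad
    return theta % (2 * np.pi), error(theta)

# Toy "skeleton": a few 3-D joint positions in the canonical self frame.
rng = np.random.default_rng(0)
canonical = rng.normal(size=(8, 3))
true_angle = 1.0
observed = canonical @ rot_y(true_angle).T  # same posture, new vantage point

angle, residual = infer_view_angle(observed, canonical)
```

After convergence, `angle` matches the true vantage point and the residual prediction error vanishes, i.e., the observed view has been made fully consistent with the embodied self-encoding. With ambiguous input (e.g., missing depth), such an error landscape can have two equally deep minima, which corresponds to the bistable perceptions discussed in the abstract.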

Generative Neural Network Model Description
Temporal Pattern Learning
Experiments
Self-Supervision and Backpropagation
Related Work
Findings
Discussion
