Abstract

Recognising familiar places is a competence required in many engineering applications that interact with the real world, such as robot navigation. Combining information from different sensory sources promotes the robustness and accuracy of place recognition. However, mismatches in data registration, dimensionality, and timing between modalities remain challenging problems in multisensory place recognition. Spurious data generated by sensor drop-out in multisensory environments is particularly problematic and often resolved through ad hoc and brittle solutions. An effective approach to these problems is demonstrated by animals as they gracefully move through the world. We therefore take a neuro-ethological approach by adopting self-supervised representation learning based on a neuroscientific model of visual cortex known as predictive coding. We demonstrate how this parsimonious network algorithm, trained using a local learning rule, can be extended to combine visual and tactile sensory cues from a biomimetic robot as it naturally explores a visually aliased environment. The place recognition performance obtained using joint latent representations generated by the network is significantly better than that of contemporary representation learning techniques. Further, we see evidence of improved robustness in place recognition in the face of unimodal sensor drop-out. The multimodal deep predictive coding algorithm presented here is also linearly extensible to accommodate more than two sensory modalities, thereby providing an intriguing example of the value of neuro-biologically plausible representation learning for multimodal navigation.
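For readers unfamiliar with predictive coding, the sketch below illustrates the core idea in a single Rao-and-Ballard-style layer: a latent code is inferred by iteratively minimising prediction error, after which the generative weights are adjusted with a purely local, Hebbian-like rule. This is a minimal illustration under our own assumptions (layer sizes, learning rates are arbitrary), not the paper's MultiPredNet implementation.

    import numpy as np

    # Minimal sketch of one predictive-coding layer (Rao & Ballard style).
    # Generic illustration only; not the authors' MultiPredNet code.
    rng = np.random.default_rng(0)
    n_input, n_latent = 64, 16
    W = 0.1 * rng.standard_normal((n_input, n_latent))  # generative weights

    def infer_and_learn(x, W, n_steps=50, lr_r=0.1, lr_w=0.01):
        """Infer a latent code r for input x, then update W locally."""
        r = np.zeros(n_latent)
        for _ in range(n_steps):
            e = x - W @ r          # prediction error (available locally)
            r += lr_r * (W.T @ e)  # latent update driven by the error
        # Local, Hebbian-like weight update: error activity times latent activity.
        W += lr_w * np.outer(x - W @ r, r)
        return r, W

    x = rng.standard_normal(n_input)
    r, W = infer_and_learn(x, W)
    print("reconstruction error:", np.linalg.norm(x - W @ r))

Because both the inference and learning steps depend only on quantities available at each connection, no globally back-propagated error signal is required; this locality is what makes the approach biologically plausible and, as the abstract notes, linearly extensible to further modalities.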

Highlights

  • The study of biology and the brain has inspired many innovative and robust solutions to hard problems in engineering

  • The large offset apparent in the tactile reconstruction errors from the Joint Multimodal Variational Autoencoder (JMVAE)-zero and MultiPredNet models suggests that these models did not accommodate this disparity

  • Interpreting the representation space of MultiPredNet is subject to further investigation, which reinforces the position that VAEs are certainly better-understood machine learning tools and, as such, are the obvious choice for adoption by robotics engineers


Introduction

The study of biology and the brain has inspired many innovative and robust solutions to hard problems in engineering. Unsupervised neural networks typically do not require a globally distributed error signal for training; instead, they find and amplify patterns in the input space by learning correlations or through local competition, typically to enable a useful reduction in dimensionality. These low-dimensional latent representations of the input are often used to perform clustering of complex data, or serve as efficient pre-processing for a supervised or reinforcement learning back-end. A VAE is a generative modelling approach that uses variational inference methods for training with large-scale, high-dimensional data sets (Kingma and Welling, 2013). This has been extended to learning bi-directional, joint distributions between different sensory modalities (Suzuki et al., 2017; Korthals et al., 2019). The dimensionality, timing, and registration, or reference frame, of these two sensory modalities are different, with the salient
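For orientation, the sketch below shows the standard VAE objective referenced above, the evidence lower bound (ELBO) of Kingma and Welling (2013), together with the reparameterisation trick. The layer sizes and input data are placeholder assumptions; a JMVAE-style model such as that of Suzuki et al. (2017) would add a second modality encoder plus a joint encoder over both inputs.

    import torch
    import torch.nn as nn

    # Minimal VAE sketch (Kingma & Welling, 2013). Layer sizes are
    # placeholder assumptions, not taken from the paper under discussion.
    class VAE(nn.Module):
        def __init__(self, n_in=784, n_latent=20):
            super().__init__()
            self.enc = nn.Linear(n_in, 2 * n_latent)  # outputs mean and log-variance
            self.dec = nn.Linear(n_latent, n_in)

        def forward(self, x):
            mu, logvar = self.enc(x).chunk(2, dim=-1)
            # Reparameterisation trick: sample z while keeping gradients.
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
            return self.dec(z), mu, logvar

    def neg_elbo(x, recon, mu, logvar):
        recon_err = nn.functional.mse_loss(recon, x, reduction="sum")
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return recon_err + kl  # negative ELBO, to be minimised

    model = VAE()
    x = torch.rand(8, 784)
    recon, mu, logvar = model(x)
    loss = neg_elbo(x, recon, mu, logvar)
    loss.backward()

Unlike the predictive coding sketch given earlier, training this objective relies on gradients propagated through the whole network, which is one of the contrasts the paper draws between VAE-based baselines and its locally trained model.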

