Cooperation/supervision of a habit by a cognitive strategy in a goal-directed navigational paradigm

Souheïl Hanoune, Philippe Gaussier, Jean-Paul Banquet, Mathias Quoy

doi:10.1186/1471-2202-16-s1-p200

Abstract

The Stimulus-Response (S-R) theory and Tolman's cognitive theory of behavior control, both of which emerged from behaviorism in the early 20th century, still provide a relevant general framework to account for animal reward-based adaptive behavior. In this paper, we propose a new paradigm for representing and implementing both the cognitive strategy and the S-R habit strategy within a unitary coding frame. Based on parallel learning of both strategies, the model explains how the fast-learning cognitive strategy can supervise and accelerate the slow-learning S-R habit strategy, and also how, in late learning stages, the habit strategy can override the cognitive one. This parallel representation is inspired by the cortico-basal functional loops [1] and by the cooperation between the cognitive associative loop, which includes the dorso-medial striatum and the mPF, and the sensory-motor loop, which links the sensory-motor cortex to the dorso-lateral striatum. The S-R habit strategy is implemented as a modified neural version of classical Q-learning, based on the model of [2], which emulates the functioning of the sensory-motor loop. The states of the model are hippocampal transitions, that is, associations between two consecutive place cells, learned in the CA1-CA3 regions of the hippocampus during exploration of the environment. The cognitive strategy is based on a map representation of the environment, namely the cognitive map [3]. Using the associations between learned transitions, the cognitive map back-propagates a reward through a tree, which allows selection of the shortest path to the goal. While the cognitive map is learned quickly, the Q-values of the Q-learning are slower to acquire. On the other hand, once fully learned, the Q-learning tends to be more accurate than the cognitive map. The model exploits this difference in speed through its parallel learning.
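The back-propagation of a reward over linked transitions can be illustrated with a minimal sketch. It is not the authors' neural implementation: the integer place labels, the `decay` factor, the toy graph, and the function names `propagate_reward` and `best_transition` are all assumptions made for illustration.

```python
# Hypothetical sketch: a place cell is an integer, a transition is a
# (from_place, to_place) pair. Decay value and graph are illustrative only.

def propagate_reward(transitions, goal_place, decay=0.9):
    """Diffuse a reward backwards from the goal over linked transitions.

    A transition that ends at the goal gets activity 1.0. Any other
    transition gets decay * (best activity among transitions that can
    directly follow it), so activity falls off with distance to the goal.
    """
    activity = {t: 0.0 for t in transitions}
    for t in transitions:
        if t[1] == goal_place:
            activity[t] = 1.0
    changed = True
    while changed:  # repeat until no activity can be raised any further
        changed = False
        for t in transitions:
            for nxt in transitions:
                if nxt[0] == t[1]:  # nxt can follow t
                    candidate = decay * activity[nxt]
                    if candidate > activity[t]:
                        activity[t] = candidate
                        changed = True
    return activity


def best_transition(activity, current_place):
    """Pick the outgoing transition with the highest propagated activity."""
    options = [t for t in activity if t[0] == current_place]
    return max(options, key=lambda t: activity[t]) if options else None


# Toy environment with the goal at place 4 and two routes from place 0:
# a short one (0 -> 1 -> 4) and a long one (0 -> 2 -> 3 -> 4).
TOY_TRANSITIONS = [(0, 1), (1, 4), (0, 2), (2, 3), (3, 4)]
toy_activity = propagate_reward(TOY_TRANSITIONS, goal_place=4)
```

Because activity shrinks by a constant factor at each step away from the goal, following the most active outgoing transition at each place traces the path with the fewest transitions in this toy graph.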
The fast acquisition of the cognitive map allows the robot to quickly choose correct paths to the goal, which shortens the convergence time of the Q-learning algorithm. The cooperation relies on the cognitive map and the Q-learning biasing the selected transition in parallel (see Fig 1). In the early learning stage, the Q-learning bias is too weak and the cognitive map is dominant (Fig2. VS Fig2.b), so that the cognitive strategy supervises the S-R habit. In the later learning stages, the Q-learning bias is stronger and more precise. Cooperation between the cognitive strategy and the S-R habit enables faster S-R learning, as shown in Fig 2. The lesion studies (Fig2.c, Fig2.d) show that the system maintains a coherent behavior even after the lesion of either of the structures supporting the two strategies. In addition, the response times highlight the superiority of the habit strategy after over-training (Fig2.c VS Fig2.d).
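The parallel biasing described above can be sketched as follows. This is a simplified illustration using the classical tabular Q-learning update, not the neural variant of [2]; the additive combination of the two biases, the learning rate `alpha`, the discount `gamma`, the bias values, and the single-state toy task are all assumptions.

```python
def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """Classical tabular Q-learning update on a dict keyed by (state, action)."""
    best_next = max((q.get((next_state, a), 0.0) for a in next_actions),
                    default=0.0)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def select_action(q, map_bias, state, actions):
    """Add the cognitive-map bias and the Q-value bias, take the strongest."""
    def drive(a):
        return map_bias.get((state, a), 0.0) + q.get((state, a), 0.0)
    return max(actions, key=drive)


# Toy task (hypothetical): from "start", only "right" is rewarded.
ACTIONS = ["left", "right"]
map_bias = {("start", "right"): 0.6, ("start", "left"): 0.2}
q = {}

# Early stage: all Q-values are zero, so the cognitive map alone decides.
first_choice = select_action(q, map_bias, "start", ACTIONS)

# The map-guided choices supply the experience that trains the Q-values.
for _ in range(30):
    a = select_action(q, map_bias, "start", ACTIONS)
    reward = 1.0 if a == "right" else 0.0
    q_update(q, "start", a, reward, next_state=None, next_actions=[])

# "Lesion" of the cognitive map: the learned Q-values still select the goal.
lesioned_choice = select_action(q, {}, "start", ACTIONS)
```

In this sketch the Q-values only ever receive updates for the actions that the map-dominated selection actually chose, which is one simple way to read the supervision of the habit by the cognitive strategy; after training, the Q-value bias exceeds the map bias and suffices on its own.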

Journal: BMC Neuroscience
Publication Date: Dec 1, 2015
License type: CC BY
