Stimulus Selection in a Q-learning Model Using Fisher Information and Monte Carlo Simulation

Kazuya Fujita, Kensuke Okada, Kentaro Katahira

doi:10.1007/s42113-022-00163-0

Abstract

Reinforcement learning models have been extensively studied for decision-making tasks with reward feedback. However, in designing an experiment to collect data for Q-learning models, the quantitative effect of a presented stimulus on the estimation precision of participant parameters has generally not been considered. That is, the lack of a mathematical framework has prevented researchers from designing an optimal experiment. To tackle this problem, this study analytically derives the Fisher information. Furthermore, this study formulates a stochastic representation of the Q-learning model, which is one of the most commonly applied reinforcement learning models. With this derivation, a two-step procedure is proposed to select the optimal stimuli in terms of estimation precision, combining a low-cost Fisher information evaluation with a more detailed finite-sample Monte Carlo simulation. The simulation studies show that reward probability reversal leads to high estimation precision for the learning rate parameter. By contrast, for the inverse temperature parameter, a larger difference in reward probability between options leads to higher estimation precision. These results reveal that the optimal experimental design depends on which trait parameters of the Q-learning model are of interest to researchers. Further, it is found that using stimuli that are undesirable in terms of trait parameter precision leads to a large bias in the correlation coefficient estimate. Based on the results, approaches to designing experiments for the Q-learning model are discussed.
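The finite-sample Monte Carlo step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes a standard two-option Q-learning model with a softmax choice rule, replaces the Fisher information evaluation with nothing at all, and uses a coarse grid search as a stand-in for maximum likelihood estimation. All function names, grid values, and reward schedules below are hypothetical choices made for illustration.

```python
import math
import random


def simulate_q_learning(alpha, beta, reward_probs, rng):
    """Simulate a two-option Q-learning agent (assumed model form).

    reward_probs is a list of (p_option0, p_option1) tuples, one per trial.
    """
    q = [0.0, 0.0]
    choices, rewards = [], []
    for p0, p1 in reward_probs:
        # With two options, softmax reduces to a logistic function of the Q difference.
        p_choose_1 = 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))
        c = 1 if rng.random() < p_choose_1 else 0
        r = 1.0 if rng.random() < (p1 if c == 1 else p0) else 0.0
        q[c] += alpha * (r - q[c])  # update only the chosen option
        choices.append(c)
        rewards.append(r)
    return choices, rewards


def log_likelihood(alpha, beta, choices, rewards):
    """Log-likelihood of observed choices under the same assumed model."""
    q = [0.0, 0.0]
    ll = 0.0
    for c, r in zip(choices, rewards):
        p_choose_1 = 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))
        p = p_choose_1 if c == 1 else 1.0 - p_choose_1
        ll += math.log(max(p, 1e-12))  # guard against log(0)
        q[c] += alpha * (r - q[c])
    return ll


def grid_mle(choices, rewards, alphas, betas):
    """Return the (alpha, beta) grid point with the highest log-likelihood."""
    best = max(
        (log_likelihood(a, b, choices, rewards), a, b)
        for a in alphas
        for b in betas
    )
    return best[1], best[2]


def mc_rmse_alpha(reward_probs, alpha, beta, n_rep=20, seed=0):
    """Monte Carlo RMSE of the learning-rate estimate for one reward schedule."""
    rng = random.Random(seed)
    alphas = [i / 10 for i in range(1, 10)]   # hypothetical grid
    betas = [float(b) for b in range(1, 9)]   # hypothetical grid
    sq_err = 0.0
    for _ in range(n_rep):
        choices, rewards = simulate_q_learning(alpha, beta, reward_probs, rng)
        a_hat, _ = grid_mle(choices, rewards, alphas, betas)
        sq_err += (a_hat - alpha) ** 2
    return math.sqrt(sq_err / n_rep)


if __name__ == "__main__":
    # Two hypothetical 100-trial schedules: one stable, one with a reversal.
    stable = [(0.8, 0.2)] * 100
    reversal = [(0.8, 0.2)] * 50 + [(0.2, 0.8)] * 50
    print("stable  ", mc_rmse_alpha(stable, alpha=0.3, beta=3.0))
    print("reversal", mc_rmse_alpha(reversal, alpha=0.3, beta=3.0))
```

Comparing the two printed RMSE values gives a rough, small-sample impression of how a reward schedule affects learning-rate recovery. With so few replications and such a coarse grid, the numbers are noisy and should not be read as a reproduction of the reported results.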

More From: Computational Brain & Behavior

Similar Papers

Confidence Interval For The Estimation of The Correlation Coefficient
JOURNAL OF ADVANCES IN MATHEMATICS | VOL. 11
24 Jul 2015

Author response: DYT1 dystonia increases risk taking in humans
David Arkadir ... Susan B Bressman
26 Apr 2016

Author response: On the normative advantages of dopamine and striatal opponency for learning and choice
Alana Jaskir ... Michael J Frank
14 Feb 2023

Influences of Reinforcement and Choice Histories on Choice Behavior in Actor-Critic Learning
Kentaro Katahira ... Kenta Kimura
Computational Brain & Behavior | VOL. 6
11 Jul 2022
