Abstract
Human performance approaches that of an ideal observer and optimal actor in some perceptual and motor tasks. These optimal abilities depend on the capacity of the cerebral cortex to store an immense amount of information and to flexibly make rapid decisions. However, behavior only approaches these limits after a long period of learning while the cerebral cortex interacts with the basal ganglia, an ancient part of the vertebrate brain that is responsible for learning sequences of actions directed toward achieving goals. Progress has been made in understanding the algorithms used by the brain during reinforcement learning, which is an online approximation of dynamic programming. Humans also make plans that depend on past experience by simulating different scenarios, which is called prospective optimization. The same brain structures in the cortex and basal ganglia that are active online during optimal behavior are also active offline during prospective optimization. The emergence of general principles and algorithms for goal-directed behavior has consequences for the development of autonomous devices in engineering applications.
Highlights
Bellman’s approach to optimizing a sequence of actions to reach a goal is based on known state transitions and payoffs [1]
The dorsal and ventral basal ganglia are heavily innervated by inputs from dopamine neurons from the substantia nigra pars compacta or ventral tegmental area, which are involved in rewards and reinforcement learning
Prospective optimization has become highly elaborated as the cortex and basal ganglia evolved to support increasingly longer time horizons and more complex behaviors
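Bellman's dynamic-programming approach, mentioned in the first highlight, can be illustrated with a minimal value-iteration sketch. The three-state world, its transitions, and its payoffs below are illustrative assumptions, not taken from the article; the point is only that when state transitions and payoffs are fully known, the value of each state can be computed by repeatedly enforcing the Bellman consistency condition V(s) = max_a [r(s, a) + γ·V(s')].

```python
GAMMA = 0.9  # discount factor for future payoffs

# Hypothetical deterministic world (an assumption for illustration):
# transitions[state][action] = (next_state, reward)
transitions = {
    "start": {"left": ("mid", 0.0), "right": ("goal", 1.0)},
    "mid":   {"left": ("start", 0.0), "right": ("goal", 2.0)},
    "goal":  {},  # terminal state, value fixed at 0
}

def value_iteration(transitions, gamma=GAMMA, tol=1e-8):
    """Solve the Bellman equation by repeated sweeps over all states."""
    V = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:
                continue  # terminal state
            # Bellman backup: best action under the current value estimate
            best = max(r + gamma * V[s2] for s2, r in actions.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

V = value_iteration(transitions)
# From "mid", going right pays 2.0 immediately, so V["mid"] = 2.0;
# from "start", going left then right pays 0 + 0.9 * 2.0 = 1.8.
```

Note that this computation requires the full transition and payoff model in advance; reinforcement learning, discussed in the summary, replaces that requirement with feedback from the environment.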
Summary
Bellman’s approach to optimizing a sequence of actions to reach a goal is based on known state transitions and payoffs [1]. The temporal-difference algorithm in reinforcement learning is closely related to the Rescorla–Wagner model [3], [4], and approximates dynamic programming [5]. This approach constructs a consistent value function for states and actions based on feedback from the environment. On the basis of the reward at the end of each game, TD-Gammon discovered new strategies that had eluded expert players. This illustrates the ability of reinforcement learning to solve the temporal credit assignment problem and to learn complex strategies that lead to winning play. We examine how brains form cognitive strategies by prospective optimization—planning future actions to optimize rewards. These more advanced aspects of reinforcement learning have the potential to greatly enhance the performance of autonomous control systems.
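The temporal-difference update at the heart of this account can be sketched in a few lines. The two-step chain of states below is a hypothetical example, not from the article; it shows how the prediction error δ = r + γ·V(s') − V(s), the analogue of the Rescorla–Wagner surprise term, drives the value function toward self-consistency using only rewards observed at the end of each episode.

```python
GAMMA = 1.0   # no discounting over this short horizon
ALPHA = 0.1   # learning rate

def td0(episodes=2000):
    """TD(0) value learning on a hypothetical chain A -> B -> end."""
    V = {"A": 0.0, "B": 0.0, "end": 0.0}
    for _ in range(episodes):
        # One episode: A -> B yields reward 0, B -> end yields reward 1.
        for s, s2, r in [("A", "B", 0.0), ("B", "end", 1.0)]:
            delta = r + GAMMA * V[s2] - V[s]  # temporal-difference error
            V[s] += ALPHA * delta             # nudge V(s) toward its target
    return V

V = td0()
# Both V["A"] and V["B"] converge toward 1.0, the reward reached at the end,
# even though state A is never directly rewarded: the error signal propagates
# credit backward in time, which is the temporal credit assignment problem.
```

TD-Gammon used this same error signal, with a neural network in place of the lookup table, to learn a value function for backgammon positions from nothing but game outcomes.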
Published in: Proceedings of the IEEE (Institute of Electrical and Electronics Engineers)