Efficient Deep Reinforcement Learning with Predictive Processing Proximal Policy Optimization

Burcu Küçükoğlu, Marcel Van Gerven, Bodo Rueckauer, Nasir Ahmad, Walraaf Borkent, Umut Güçlü

doi:10.51628/001c.123366

Abstract

Advances in reinforcement learning (RL) often rely on massive compute resources and remain notoriously sample-inefficient. In contrast, the human brain is able to efficiently learn effective control strategies using limited resources. This raises the question of whether insights from neuroscience can be used to improve current RL methods. Predictive processing is a popular theoretical framework that maintains that the human brain is actively seeking to minimize surprise. We show that recurrent neural networks that predict their own sensory states can be leveraged to minimize surprise, yielding substantial gains in cumulative reward. Specifically, we present the Predictive Processing Proximal Policy Optimization (P4O) agent, an actor-critic reinforcement learning agent that applies predictive processing to a recurrent variant of the PPO algorithm by integrating a world model in its hidden state. Even without hyperparameter tuning, P4O significantly outperforms a baseline recurrent variant of the PPO algorithm on multiple Atari games using a single GPU. It also outperforms other state-of-the-art agents given the same wall-clock time and exceeds human gamer performance on multiple games, including Seaquest, which is a particularly challenging environment in the Atari domain. Altogether, our work underscores how insights from the field of neuroscience may support the development of more capable and efficient artificial agents.
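The abstract does not spell out how the surprise term enters training, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It combines the standard PPO clipped surrogate loss with a prediction-error penalty. The mean-squared-error form of "surprise", the weighted-sum combination, the weight `beta`, and all function names are assumptions made for illustration.

```python
import math


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Negative PPO clipped surrogate objective, averaged over samples."""
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)


def surprise_loss(predicted, observed):
    """Mean squared prediction error (assumed stand-in for 'surprise')."""
    errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return sum(errors) / len(errors)


def combined_loss(new_logp, old_logp, advantages, predicted, observed,
                  beta=1.0, clip_eps=0.2):
    """Policy loss plus a weighted surprise penalty (hypothetical weighting)."""
    policy = ppo_clip_loss(new_logp, old_logp, advantages, clip_eps)
    return policy + beta * surprise_loss(predicted, observed)
```

In a recurrent agent, `predicted` would come from a readout of the hidden state and `observed` from the next sensory input; here both are plain lists of floats so the arithmetic can be checked by hand.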

Journal: Neurons, Behavior, Data analysis, and Theory
Publication Date: Sep 4, 2024
License type: CC BY 4.0

