About distributing rewards to a rule with probabilistic state transition

Wataru Uemura

doi:10.1109/sice.2007.4421458


Publication Date: Sep 1, 2007

Citations: 2

Affiliation: Ryukoku University


Abstract


Profit Sharing is a reinforcement learning method. In Profit Sharing, the agent (the learner) distributes a reward to the rules it selected once it reaches a goal state. If a rule has a non-deterministic state transition, for example a probabilistic one, the agent must take that transition into account when estimating the value of the rule. Conventional Profit Sharing does not consider probabilistic state transitions: it distributes the same reward whether the state transition probability is 10%, 1%, or any other value. In this paper, we propose a novel Profit Sharing method that considers probabilistic state transitions. In an environment with deterministic state transitions, we show that conventional Profit Sharing and the proposed Profit Sharing perform the same. We also show that the proposed Profit Sharing performs well against conventional Profit Sharing.
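To make the reward distribution step concrete, here is a minimal sketch of conventional Profit Sharing only. The abstract does not specify the credit function or the proposed method, so the geometrically decreasing credit, the `decay` parameter, the function name `distribute_reward`, and the toy episode are all assumptions for illustration.

```python
from collections import defaultdict


def distribute_reward(weights, episode, reward, decay=0.5):
    """Reinforce each (state, action) rule of an episode after the goal is reached.

    Assumption: credit shrinks geometrically with distance from the goal.
    The abstract does not state which credit function the paper uses.
    """
    credit = reward
    # Walk the episode backwards so the rule closest to the goal gets the most credit.
    for rule in reversed(episode):
        weights[rule] += credit
        credit *= decay
    return weights


# Hypothetical episode: the rules the agent fired, in order, before reaching the goal.
weights = defaultdict(float)
episode = [("s0", "right"), ("s1", "right"), ("s2", "up")]
distribute_reward(weights, episode, reward=1.0, decay=0.5)
```

Nothing in this update depends on how likely each observed transition was, so a rule that reached the goal through a 1% transition is reinforced exactly like one that reached it deterministically. That is the limitation the abstract attributes to conventional Profit Sharing.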
