Abstract

For an intelligent agent to be truly autonomous, it must be able to adapt its representation to the requirements of its task as it interacts with the world. Most current approaches to on-line feature extraction are ad hoc; in contrast, this paper derives principled criteria for representational adequacy by applying the psychological principle of cognitive economy to reinforcement learning. The criteria are principled because they are based on an analysis of the amount of reward the agent forfeits when it generalizes over states. This analysis shows that action-value errors are sometimes irrelevant, and that the agent may optimize its performance with limited cognitive resources by grouping together states whose differences do not matter in its task. The paper presents an algorithm based on this analysis, incorporating an active form of Q-learning and partitioning continuous state-spaces by merging and splitting Voronoi regions. The experiments illustrate a new methodology for testing and comparing representations by means of learning curves. Results from the puck-on-a-hill task demonstrate the algorithm's ability to learn effective representations, superior to those produced by some other well-known methods.
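The full algorithm is given in the paper itself; as an informal illustration only, the sketch below shows how a Voronoi partition can serve as a state aggregation for Q-learning. The class name, the split rule, and all parameters here are hypothetical assumptions for exposition (nearest-prototype encoding plus the standard tabular Q-learning update), not the paper's exact method.

```python
import numpy as np

# Illustrative sketch only: Voronoi-region state aggregation for Q-learning.
# The prototypes, update rule, and split mechanics are assumptions made for
# exposition, not the paper's exact algorithm.

class VoronoiQAgent:
    def __init__(self, prototypes, n_actions, alpha=0.1, gamma=0.99):
        self.prototypes = np.asarray(prototypes, dtype=float)  # one point per region
        self.q = np.zeros((len(self.prototypes), n_actions))   # one row of Q-values per region
        self.alpha, self.gamma = alpha, gamma

    def region(self, state):
        # A Voronoi partition assigns each continuous state to its
        # nearest prototype; that region index is the aggregated state.
        return int(np.argmin(np.linalg.norm(self.prototypes - state, axis=1)))

    def update(self, s, a, r, s_next):
        # Standard one-step Q-learning, applied to the aggregated states.
        i, j = self.region(s), self.region(s_next)
        target = r + self.gamma * self.q[j].max()
        self.q[i, a] += self.alpha * (target - self.q[i, a])

    def split(self, i, new_prototype):
        # Splitting region i adds a prototype; copying the parent's Q-values
        # lets learning resume from the existing estimates.
        self.prototypes = np.vstack([self.prototypes, new_prototype])
        self.q = np.vstack([self.q, self.q[i]])
```

In the spirit of the cognitive-economy criterion the abstract describes, a region would be split when the states inside it demand different actions (their action-value differences cost the agent reward), and regions would be merged when those differences are irrelevant to the task.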
