Truncated Approximate Dynamic Programming with Task-Dependent Terminal Value

Amir-Massoud Farahmand, Yuji Igarashi, Hiroki Konaka, Daniel Nikovski

doi:10.1609/aaai.v30i1.10397

Abstract

We propose a new class of computationally fast algorithms for finding a close-to-optimal policy for Markov Decision Processes (MDPs) with a large finite horizon T. The main idea is that instead of planning until the time horizon T, we plan only up to a truncated horizon H << T and use an estimate of the true optimal value function as the terminal value. Our approach to finding the terminal value function is to learn a mapping from an MDP to its value function by solving many similar MDPs during a training phase and fitting a regression estimator. We analyze the method by providing an error propagation theorem that shows the effect of various sources of error on the quality of the solution. We also empirically validate this approach in a real-world application, the design of an energy management system for Hybrid Electric Vehicles, with promising results.
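
The truncation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a small tabular MDP with undiscounted rewards, and the names `backward_induction`, `P`, `R`, and `V_tail` are hypothetical. The paper obtains the terminal value from a learned regression estimator; here the exact value-to-go stands in for that estimate, so the sketch only shows that planning H steps with a correct terminal value recovers the full-horizon result.

```python
import numpy as np


def backward_induction(P, R, horizon, terminal_value):
    """Finite-horizon dynamic programming on a tabular MDP (illustrative).

    P: (A, S, S) transition probabilities, P[a, s, s'].
    R: (S, A) immediate rewards.
    terminal_value: (S,) value assigned to the state reached after `horizon` steps.
    Returns the value at the first step and the greedy action for each step and state.
    """
    V = np.asarray(terminal_value, dtype=float)
    policy = np.zeros((horizon, R.shape[0]), dtype=int)
    for t in reversed(range(horizon)):
        # Q[s, a] = R[s, a] + sum over s' of P[a, s, s'] * V[s']
        Q = R + np.einsum("asn,n->sa", P, V)
        policy[t] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return V, policy


# Hypothetical random MDP: 5 states, 2 actions, horizon T = 50, truncation H = 5.
rng = np.random.default_rng(0)
S, A, T, H = 5, 2, 50, 5
P = rng.random((A, S, S))
P /= P.sum(axis=2, keepdims=True)  # normalize rows into probability distributions
R = rng.random((S, A))

# Reference: plan over the full horizon T with zero terminal value.
V_full, pi_full = backward_induction(P, R, T, np.zeros(S))

# Stand-in for the learned terminal value: exact value-to-go with T - H steps left.
V_tail, _ = backward_induction(P, R, T - H, np.zeros(S))

# Truncated plan: only H backups, using V_tail as the terminal value.
V_trunc, pi_trunc = backward_induction(P, R, H, V_tail)
```

In this sketch the truncated plan performs H backups at decision time instead of T. With an approximate terminal value in place of `V_tail`, the first-step values and actions would generally differ from the full-horizon ones, which is the kind of error the paper's error propagation theorem addresses.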

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence
Publication Date: Mar 5, 2016
Citations: 4

Similar Papers

Identifying effective policies in approximate dynamic programming: beyond regression
05 Dec 2010

Identifying effective policies in approximate dynamic programming: Beyond regression
01 Dec 2010

Novel energy management system for hybrid electric vehicles utilizing car navigation over a commuting route
S Ichikawa ... N Miki
14 Jun 2004

A Natural Language Argumentation Interface for Explanation Generation in Markov Decision Processes
Thomas Dodson ... Judy Goldsmith
01 Jan 2010
