Adaptive value function approximation for continuous-state stochastic dynamic programming

Huiyuan Fan, Prashant K. Tarun, Victoria C.P. Chen

doi:10.1016/j.cor.2012.11.016

Abstract

Approximate dynamic programming (ADP) commonly employs value function approximation to numerically solve complex dynamic programming problems. A statistical perspective of value function approximation employs a design and analysis of computer experiments (DACE) approach, where the “computer experiment” yields points on the value function curve. The DACE approach has been used to numerically solve high-dimensional, continuous-state stochastic dynamic programming problems, and it primarily performs two tasks: (1) design of experiments and (2) statistical modeling. The use of design of experiments enables more efficient discretization. However, identifying the appropriate sample size is not straightforward. Furthermore, identifying the appropriate model structure is a well-known problem in the field of statistics. In this paper, we present a sequential method that can adaptively determine both sample size and model structure. Number-theoretic methods (NTM) are used to sequentially grow the experimental design because of their ability to fill the design space. Feed-forward neural networks (NNs) are used for statistical modeling because their structural complexity is adjustable. This adaptive value function approximation (AVFA) method must be automated to enable efficient implementation within ADP. We introduce an AVFA algorithm that increments the size of the state space training data in each sequential step and, for each sample size, performs a successive model search to find an optimal NN model. The new algorithm is tested on a nine-dimensional inventory forecasting problem.
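
The sequential idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm. Everything in it is an assumption made for illustration: a Halton sequence stands in for the number-theoretic design, a small tanh network trained by plain SGD stands in for the feed-forward NN, and the stopping rules (a validation-error tolerance `tol` and a "stop when a larger network does not help" search over `hidden_grid`) are placeholders. All function and parameter names are hypothetical.

```python
import math
import random

PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23]  # one base per dimension (up to 9)


def radical_inverse(i, base):
    """Van der Corput radical inverse of the integer i in the given base."""
    inv, f = 0.0, 1.0 / base
    while i > 0:
        inv += (i % base) * f
        i //= base
        f /= base
    return inv


def halton_points(start, stop, dim):
    """Halton points with indices start..stop-1 in [0, 1)^dim."""
    return [[radical_inverse(i, PRIMES[d]) for d in range(dim)]
            for i in range(start, stop)]


def train_nn(X, y, hidden, epochs=100, lr=0.05, seed=0):
    """Fit a one-hidden-layer tanh network by plain SGD; return a predictor."""
    rng = random.Random(seed)
    dim = len(X[0])
    W = [[rng.uniform(-1, 1) for _ in range(dim + 1)] for _ in range(hidden)]
    v = [rng.uniform(-1, 1) for _ in range(hidden + 1)]  # last entry is the bias

    def forward(x):
        h = [math.tanh(sum(w[j] * x[j] for j in range(dim)) + w[dim]) for w in W]
        return h, sum(v[k] * h[k] for k in range(hidden)) + v[hidden]

    for _ in range(epochs):
        for x, t in zip(X, y):
            h, out = forward(x)
            err = out - t
            for k in range(hidden):
                g = err * v[k] * (1.0 - h[k] ** 2)  # gradient at hidden unit k
                v[k] -= lr * err * h[k]
                for j in range(dim):
                    W[k][j] -= lr * g * x[j]
                W[k][dim] -= lr * g
            v[hidden] -= lr * err
    return lambda x: forward(x)[1]


def mse(model, X, y):
    """Mean squared error of model predictions on (X, y)."""
    return sum((model(x) - t) ** 2 for x, t in zip(X, y)) / len(X)


def avfa_sketch(f, dim, step=20, n_max=80, hidden_grid=(1, 2, 4), tol=1e-3):
    """Grow the design in steps; for each size, search over network sizes."""
    rng = random.Random(1)
    val_X = [[rng.random() for _ in range(dim)] for _ in range(50)]
    val_y = [f(x) for x in val_X]
    X, y, best = [], [], None
    while len(X) < n_max:
        # Extend the design: earlier Halton points are kept, new ones appended.
        new = halton_points(len(X) + 1, len(X) + step + 1, dim)
        X += new
        y += [f(x) for x in new]
        # Successive model search: stop once a larger network does not help.
        best = None
        for hidden in hidden_grid:
            model = train_nn(X, y, hidden)
            err = mse(model, val_X, val_y)
            if best is None or err < best[0]:
                best = (err, hidden, model)
            else:
                break
        if best[0] < tol:  # assumed stopping rule on validation error
            break
    return len(X), best[1], best[0], best[2]


if __name__ == "__main__":
    toy_value = lambda x: (x[0] - 0.5) ** 2 + 0.5 * x[1]  # stand-in "value function"
    n, hidden, err, _ = avfa_sketch(toy_value, dim=2)
    print(n, hidden, err)
```

The Halton sequence is used here only because it is extensible: adding points never moves earlier ones, so each training set contains the previous one. The design family and the model search in the paper itself may differ.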

Journal: Computers and Operations Research
Publication Date: Nov 23, 2012
Citations: 11

Similar Papers

Efficient approximate dynamic programming based on design and analysis of computer experiments for infinite-horizon optimization
Ying Chen ... Yuan Zhou
Computers & Operations Research | VOL. 124
24 Jun 2020

Data mining for state space orthogonalization in adaptive dynamic programming
Bancha Ariyajunya ... Seoung Bum Kim
Expert Systems With Applications | VOL. 76
28 Jan 2017

Addressing state space multicollinearity in solving an ozone pollution dynamic control problem
Bancha Ariyajunya ... Jay Rosenberger
European Journal of Operational Research | VOL. 289
12 Jul 2020

Energy management of PV-storage systems: ADP approach with temporal difference learning
Chanaka Keerthisinghe ... Gregor Verbic
01 Jun 2016
