Abstract
When dealing with numerical solution of stochastic optimal control problems, stochastic dynamic programming is the natural framework. In order to try to overcome the so-called curse of dimensionality, the stochastic programming school promoted another approach based on scenario trees which can be seen as the combination of Monte Carlo sampling ideas on the one hand, and of a heuristic technique to handle causality (or nonanticipativeness) constraints on the other hand.However, if one considers that the solution of a stochastic optimal control problem is a feedback law which relates control to state variables, the numerical resolution of the optimization problem over a scenario tree should be completed by a feedback synthesis stage in which, at each time step of the scenario tree, control values at nodes are plotted against corresponding state values to provide a first discrete shape of this feedback law from which a continuous function can be finally inferred. From this point of view, the scenario tree approach faces an important difficulty: at the first time stages (close to the tree root), there are a few nodes (or Monte-Carlo particles), and therefore a relatively scarce amount of information to guess a feedback law, but this information is generally of a good quality (that is, viewed as a set of control value estimates for some particular state values, it has a small variance because the future of those nodes is rich enough); on the contrary, at the final time stages (near the tree leaves), the number of nodes increases but the variance gets large because the future of each node gets poor (and sometimes even deterministic).After this dilemma has been confirmed by numerical experiments, we have tried to derive new variational approaches. First of all, two different formulations of the essential constraint of nonanticipativeness are considered: one is called algebraic and the other one is called functional. Next, in both settings, we obtain optimality conditions for the corresponding optimal control problem. For the numerical resolution of those optimality conditions, an adaptive mesh discretization method is used in the state space in order to provide information for feedback synthesis. This mesh is naturally derived from a bunch of sample noise trajectories which need not to be put into the form of a tree prior to numerical resolution. In particular, an important consequence of this discrepancy with the scenario tree approach is that the same number of nodes (or points) are available from the beginning to the end of the time horizon. And this will be obtained without sacrifying the quality of the results (that is, the variance of the estimates). Results of experiments with a hydro-electric dam production management problem will be presented and will demonstrate the claimed improvements. A more realistic problem will also be presented in order to demonstrate the effectiveness of the method for high dimensional problems.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.