Abstract

Recent advances in Symbolic Dynamic Programming (SDP) combined with the extended algebraic decision diagram (XADD) have provided exact solutions for expressive subclasses of finite-horizon Hybrid Markov Decision Processes (HMDPs) with mixed continuous and discrete state and action parameters. Unfortunately, SDP suffers from two major drawbacks: (1) it solves for all states and can be intractable for many problems that inherently have large optimal XADD value function representations; and (2) it cannot maintain compact (pruned) XADD representations for domains with nonlinear dynamics and reward due to the need for nonlinear constraint checking. In this work, we simultaneously address both of these problems by introducing real-time SDP (RTSDP). RTSDP addresses (1) by focusing the solution and value representation only on regions reachable from a set of initial states, and it addresses (2) by using visited states as witnesses of reachable regions to assist in pruning irrelevant or unreachable (nonlinear) regions of the value function. To this end, RTSDP enjoys provable convergence over the set of initial states and substantial space and time savings over SDP, as we demonstrate in a variety of hybrid domains ranging from inventory to reservoir to traffic control.
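
The reachability idea behind point (1) can be illustrated with a small sketch. This is not the RTSDP algorithm: it swaps in classical tabular real-time dynamic programming on a purely discrete toy problem, with no XADDs and no continuous variables. The inventory model, its constants, and every function name below are hypothetical and chosen only for illustration. The sketch compares how many (state, steps-to-go) entries a trial-based solver touches from one initial state with the number that full backward induction computes.

```python
import random

# Hypothetical toy inventory model (an assumption, not taken from the paper).
CAPACITY = 20                      # stock level is an integer in 0..CAPACITY
ACTIONS = (0, 1, 2)                # units ordered per step
DEMANDS = ((0, 0.5), (1, 0.5))     # (demand, probability)
HORIZON = 4
MAX_STEP_REWARD = 1.0              # at most one unit is sold per step


def outcomes(stock, order):
    """Return a list of (probability, reward, next_stock) for one step."""
    filled = min(stock + order, CAPACITY)
    result = []
    for demand, prob in DEMANDS:
        sold = min(filled, demand)
        reward = 1.0 * sold - 0.1 * order   # revenue minus ordering cost
        result.append((prob, reward, filled - sold))
    return result


def value(V, stock, steps_to_go):
    """Look up V; unvisited entries default to an optimistic upper bound."""
    if steps_to_go == 0:
        return 0.0
    return V.get((stock, steps_to_go), MAX_STEP_REWARD * steps_to_go)


def q_value(V, stock, order, steps_to_go):
    """Expected reward of ordering `order` units, then following V."""
    return sum(p * (r + value(V, s2, steps_to_go - 1))
               for p, r, s2 in outcomes(stock, order))


def full_dp():
    """Backward induction over every state, in the spirit of plain DP."""
    V = {}
    for steps_to_go in range(1, HORIZON + 1):
        for stock in range(CAPACITY + 1):
            V[(stock, steps_to_go)] = max(
                q_value(V, stock, a, steps_to_go) for a in ACTIONS)
    return V


def rtdp(initial_stock, trials, seed=0):
    """Trial-based backups only at states visited from the initial state."""
    rng = random.Random(seed)
    V = {}
    for _ in range(trials):
        stock = initial_stock
        for steps_to_go in range(HORIZON, 0, -1):
            q = {a: q_value(V, stock, a, steps_to_go) for a in ACTIONS}
            best = max(q, key=q.get)
            V[(stock, steps_to_go)] = q[best]          # Bellman backup
            outs = outcomes(stock, best)
            stock = rng.choices([s2 for _, _, s2 in outs],
                                weights=[p for p, _, _ in outs])[0]
    return V


if __name__ == "__main__":
    v_all = full_dp()
    v_rt = rtdp(initial_stock=0, trials=200)
    print("entries computed by full DP:", len(v_all))
    print("entries touched by RTDP:   ", len(v_rt))
    print("value at the initial state:", v_rt[(0, HORIZON)], "vs", v_all[(0, HORIZON)])
```

Because unvisited entries start at an optimistic upper bound and each backup preserves that bound, the trial-based value at the initial state never falls below the exact one. The symbolic setting of the abstract replaces the table with a piecewise value function over continuous regions, which this sketch does not attempt to model.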
