Data mining for state space orthogonalization in adaptive dynamic programming

Bancha Ariyajunya, Ying Chen, Seoung Bum Kim, Victoria C.P. Chen

doi:10.1016/j.eswa.2017.01.020

Abstract

Dynamic programming (DP) is a mathematical programming approach for optimizing a system that changes over time and is a common approach for developing intelligent systems. Expert systems that are intelligent must be able to adapt dynamically over time. An optimal DP policy identifies the optimal decision dependent on the current state of the system. Hence, the decisions controlling the system can intelligently adapt to changing system states. Although DP has existed since Bellman introduced it in 1957, exact DP policies are only possible for problems with low dimension or under very limiting restrictions. Fortunately, advances in computational power have given rise to approximate DP (ADP). However, most ADP algorithms are still computationally intractable for high-dimensional problems. This paper specifically considers continuous-state DP problems in which the state variables are multicollinear. The issue of multicollinearity is currently ignored in the ADP literature, but in the statistics community it is well known that high multicollinearity leads to unstable (high variance) parameter estimates in statistical modeling. While not all real-world DP applications involve high multicollinearity, it is not uncommon for real cases to involve observed state variables that are correlated, such as the air quality ozone pollution application studied in this research. Correlation is a common occurrence in observed data, including sources in meteorology, energy, finance, manufacturing, health care, etc.

ADP algorithms for continuous-state DP achieve an approximate solution through discretization of the state space and model approximations. Typical state space discretizations involve full-dimensional grids or random sampling. The former option requires exponential growth in the number of state points as the state space dimension grows, while the latter option is typically inefficient and requires an intractable number of state points.
The exception is computationally tractable ADP methods based on a design and analysis of computer experiments (DACE) approach. However, the DACE approach utilizes ideal experimental designs that are (nearly) orthogonal, and a multicollinear state space will not be appropriately represented by such ideal experimental designs. While one could directly build approximations over the multicollinear state space, the issue of unstable model approximations remains unaddressed. Our approach for handling multicollinearity employs data mining methods for two purposes: (1) to reduce the dimensionality of a DP problem and (2) to orthogonalize a multicollinear DP state space and enable the use of a computationally efficient DACE-based ADP approach. Our results demonstrate the risk of ignoring high multicollinearity, quantified by high variance inflation factors representing model instability. Our comparisons using an air quality ozone pollution case study provide guidance on combining feature selection and feature extraction to guarantee orthogonality while achieving over 95% dimension reduction and good model accuracy.
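
As a rough illustration of the two quantities discussed above, the following sketch computes variance inflation factors (VIFs) for a set of correlated variables and then orthogonalizes them. It is not the authors' procedure: principal component analysis is assumed here as one common feature extraction method (the abstract does not name the method used), the data are synthetic, and the function names `variance_inflation_factors` and `pca_scores` are hypothetical.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of each column of X.

    Uses the identity that the VIFs are the diagonal entries of the
    inverse of the correlation matrix of the columns.
    """
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

def pca_scores(X, n_components):
    """Project standardized columns of X onto the leading principal axes.

    PCA is an assumed stand-in for "feature extraction"; the resulting
    score columns are mutually uncorrelated.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

# Synthetic, made-up "state variables": three noisy copies of one signal,
# so they are highly multicollinear.
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 1))
X = np.hstack([base + 0.1 * rng.normal(size=(500, 1)) for _ in range(3)])

vif_raw = variance_inflation_factors(X)       # large values: unstable fits
scores = pca_scores(X, n_components=2)        # reduced, orthogonal space
vif_pca = variance_inflation_factors(scores)  # approximately 1 per column

print("VIF before:", vif_raw)
print("VIF after: ", vif_pca)
```

In this toy setting the raw variables have VIFs well above the common rule-of-thumb threshold of 10, while the extracted components have VIFs of essentially 1, which is the sense in which an orthogonalized state space suits (nearly) orthogonal experimental designs.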

Full Text