Neuro-dynamic Programming Research Articles

The pH neutralization process has long served as a representative benchmark problem in nonlinear chemical process control because of its nonlinearity and time-varying nature. General nonlinear processes are difficult to control with linear model-based methods, so nonlinear control must be considered. Among the numerous approaches suggested, the most rigorous is dynamic optimization. However, as the size of the problem grows, the dynamic programming approach suffers from the curse of dimensionality. To avoid this problem, the Neuro-Dynamic Programming (NDP) approach was proposed by Bertsekas and Tsitsiklis [1996]. NDP uses all the data collected to build an approximation of the optimal cost-to-go function, which is then used to find the optimal input move in real-time control. The approximator can be any type of function, such as a polynomial or a neural network. In this study, an NDP-based algorithm was applied to a pH neutralization process to investigate the feasibility of the algorithm and to deepen the understanding of its basic characteristics. Two approximators are investigated: a neural network, which requires training, and the k-nearest neighbor method, which requires querying instead of training. The approximator has to use data from the optimal control strategy. If the optimal control strategy is not readily available, a suboptimal control strategy can be used instead, but the laborious Bellman iterations are then necessary. For the pH neutralization process it is rather easy to devise an optimal control strategy, so we used one and did not perform the Bellman iteration. The effects of constraints on control moves are also studied. In the simulations, the NDP method outperforms conventional PID control.
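The k-nearest neighbor variant of the idea in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and does not come from the paper: the scalar dynamics in `step`, the quadratic `stage_cost`, the proportional policy used to generate data, and the grid of candidate moves. The data-generating policy is only a hand-picked stabilizing one, and no Bellman iteration is performed, so the result is a one-step improvement on toy data rather than the study's controller.

```python
def step(x, u):
    """Hypothetical scalar dynamics (a stand-in, not a pH model)."""
    return 0.9 * x + u


def stage_cost(x, u):
    """Quadratic penalty on the state deviation and on the control move."""
    return x * x + 0.1 * u * u


def collect_samples(policy, starts, horizon=50):
    """Simulate closed-loop runs and record (state, cost-to-go) pairs."""
    samples = []
    for x in starts:
        states, costs = [], []
        for _ in range(horizon):
            u = policy(x)
            states.append(x)
            costs.append(stage_cost(x, u))
            x = step(x, u)
        cost_to_go = 0.0
        # Accumulate backwards so each state gets the cost from there onward.
        for state, cost in zip(reversed(states), reversed(costs)):
            cost_to_go += cost
            samples.append((state, cost_to_go))
    return samples


def knn_cost_to_go(query, samples, k=3):
    """Query step: average the cost-to-go of the k nearest stored states."""
    nearest = sorted(samples, key=lambda s: abs(s[0] - query))[:k]
    return sum(cost for _, cost in nearest) / len(nearest)


def greedy_control(x, samples, candidates, k=3):
    """Pick the move minimizing stage cost plus approximate cost-to-go."""
    return min(
        candidates,
        key=lambda u: stage_cost(x, u) + knn_cost_to_go(step(x, u), samples, k),
    )


# Data from an assumed stabilizing policy u = -0.5 x (hypothetical).
samples = collect_samples(lambda x: -0.5 * x, starts=[-2.0, -1.0, 1.0, 2.0])
moves = [-1.0, -0.5, 0.0, 0.5, 1.0]  # bounded candidate moves (input constraint)
u = greedy_control(1.0, samples, moves)
```

Restricting `moves` to a bounded grid is one simple way to represent constraints on control moves; no training step is needed because the approximator only queries the stored samples.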

The paper considers a version of the vehicle routing problem where customers’ demands are uncertain. The focus is on dynamically routing a single vehicle to serve the demands of a known set of geographically dispersed customers during real-time operations. The goal consists of minimizing the expected distance traveled in order to serve all customers’ demands. Since actual demand is revealed upon arrival of the vehicle at the location of each customer, fully exploiting this feature requires a dynamic approach. This work studies the suitability of the emerging field of neuro-dynamic programming (NDP) in providing approximate solutions to this difficult stochastic combinatorial optimization problem. The paper compares the performance of two NDP algorithms: optimistic approximate policy iteration and a rollout policy. While the former improves the performance of a nearest-neighbor policy by 2.3%, the computational results indicate that the rollout policy generates higher quality solutions. The implication for the practitioner is that the rollout policy is a promising candidate for vehicle routing applications where a dynamic approach is required.

Scope and purpose

Recent years have seen a growing interest in the development of vehicle routing algorithms to cope with the uncertain and dynamic situations found in real-world applications (see the recent survey paper by Powell et al. [1]). As noted by Psaraftis [2], dramatic advances in information and communication technologies provide new possibilities and opportunities for vehicle routing research and applications. The enhanced capability of capturing the information that becomes available during real-time operations opens up new research directions. This informational availability provides the possibility of developing dynamic routing algorithms that take advantage of the information that is dynamically revealed during operations. Exploiting such information presents a significant challenge to the operations research/management science community. The single vehicle routing problem with stochastic demands [3] provides an example of a simple, yet very difficult to solve exactly, dynamic vehicle routing problem [2, p. 157]. The problem can be formulated as a stochastic shortest path problem [4] characterized by an enormous number of states. Neuro-dynamic programming [5,6] is a recent methodology that can be used to approximately solve very large and complex stochastic decision and control problems. In this spirit, this paper is meant to study the applicability of neuro-dynamic programming algorithms to the single-vehicle routing problem with stochastic demands.
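The rollout policy compared in the abstract above can be sketched roughly as follows: for each candidate next customer, simulate the nearest-neighbor base policy to completion under sampled demands and pick the candidate with the lowest average distance. The sketch rests on assumptions that are not taken from the paper: the recourse rule in `serve` (one round trip to the depot when the load is short), demands never exceeding the vehicle capacity, the depot at the origin, and the hypothetical `sample_demand` callback.

```python
import math
import random

DEPOT = (0.0, 0.0)


def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def serve(pos, load, customer, demand, capacity):
    """Drive to `customer`, observe its demand, and serve it.

    Assumed recourse: if the load on board is short, make one round trip
    to the depot to refill. Returns (distance driven, load left).
    """
    travelled = dist(pos, customer)
    if demand > load:
        travelled += 2 * dist(customer, DEPOT)
        load += capacity
    return travelled, load - demand


def nearest_neighbor_cost(pos, load, unserved, capacity, sample_demand, rng):
    """Distance of one simulated run of the nearest-neighbor base policy."""
    total, remaining = 0.0, list(unserved)
    while remaining:
        nxt = min(remaining, key=lambda c: dist(pos, c))
        leg, load = serve(pos, load, nxt, sample_demand(rng), capacity)
        total += leg
        pos = nxt
        remaining.remove(nxt)
    return total + dist(pos, DEPOT)  # final return to the depot


def rollout_choice(pos, load, unserved, capacity, sample_demand, rng, n_sims=30):
    """Pick the next customer by Monte Carlo lookahead over the base policy."""
    def expected_cost(first):
        rest = [c for c in unserved if c != first]
        total = 0.0
        for _ in range(n_sims):
            leg, left = serve(pos, load, first, sample_demand(rng), capacity)
            total += leg + nearest_neighbor_cost(
                first, left, rest, capacity, sample_demand, rng
            )
        return total / n_sims

    return min(unserved, key=expected_cost)


rng = random.Random(0)
customers = [(1.0, 0.0), (5.0, 0.0)]
no_demand = lambda rng: 0  # degenerate demand, to make the example deterministic
choice = rollout_choice((2.0, 0.0), 10, customers, 10, no_demand, rng, n_sims=1)
```

In this small deterministic case the nearest-neighbor policy would visit (1, 0) first for a total distance of 10, while the rollout step selects (5, 0) first for a total of 8. With random demands, `n_sims` controls the trade-off between estimate quality and computation time.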

Articles published on Neuro-dynamic Programming

STOCHASTIC APPROXIMATE SCHEDULING BY NEURODYNAMIC LEARNING

NEURO-DYNAMIC PROGRAMMING FOR THE EXPLORATION OF UNKNOWN GRAPHS

AN RBF BASED NEURO-DYNAMIC APPROACH FOR THE CONTROL OF STOCHASTIC DYNAMIC SYSTEMS

Control of pH neutralization process using simulation based dynamic programming

Fractionation in radiation treatment planning

Valuation of American Options via Basis Functions

Call admission control in cellular networks: A reinforcement learning solution

Optimization of a Fed-Batch Bioreactor Using Simulation-Based Approach

Adaptation and Learning in Distributed Production Control

From Perturbation Analysis to Markov Decision Processes and Reinforcement Learning

Neuro dynamic programming algorithms for computing optimal control of production lines

Neuro-dynamic programming method for MPC

Comparing neuro-dynamic programming algorithms for the vehicle routing problem with stochastic demands

Call admission control and routing in integrated services networks using neuro-dynamic programming

Missile defense and interceptor allocation by neuro-dynamic programming

An Approximation Algorithm for Optimal Stopping

Neuro-Dynamic Programming [Book News & Reviews]
