There is an abundance of real-life situations of the following nature. Several projects, of various types, need to be done by some person, machine, or other device. These require certain amounts of time, while costs (or rewards) are associated with performing such projects, delaying them, or switching between projects. The problem is then to find a policy that determines, for given numbers of projects of the various types, on which project one should work so as to minimize (discounted or average) costs. Examples of this problem turn up in administrative settings, in manufacturing, in computer communications, etc. The problem of dynamically prioritizing projects (traffic classes) may often be formulated as a Markov decision problem, but the explosion of the state space usually makes the numerical solution of realistically sized problems prohibitive. An important class of heuristic policies, which often provide near-optimal solutions (and can sometimes even be shown to be optimal), is the class of priority index policies. A static index rule for minimizing completion costs in single-machine scheduling assigns each job the ratio of its holding cost rate to its processing time as its index: work on the job with the highest index. A dynamic index rule has been shown (Gittins and Jones (1974), as referenced in Niño-Mora (2007)) to be optimal for the multi-armed bandit problem, i.e., the sequential allocation of work to a collection of stochastic projects (bandits) so as to maximize the expected total discounted reward earned over an infinite horizon. Whittle (1988) discusses the multi-armed restless bandit problem, in which bandits may change state even while passive, under the long-run average criterion.
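To make the static index rule concrete, the following minimal Python sketch (the job data and function names are illustrative assumptions, not taken from the paper) orders jobs by the ratio of holding cost rate to processing time and evaluates the resulting total completion cost on a single machine.

    # Static index rule sketch: each job j has a holding-cost rate c and a
    # processing time p; serving jobs in decreasing order of the index c/p
    # minimizes the total holding cost accrued until completion.

    def static_index_order(jobs):
        """Return jobs sorted by the priority index c/p, highest first."""
        return sorted(jobs, key=lambda job: job["c"] / job["p"], reverse=True)

    def total_completion_cost(ordered_jobs):
        """Sum of c_j * C_j, where C_j is the completion time of job j."""
        t, cost = 0.0, 0.0
        for job in ordered_jobs:
            t += job["p"]          # job finishes at time t
            cost += job["c"] * t   # holding cost accrued until completion
        return cost

    # Illustrative data: three jobs with (holding cost rate, processing time).
    jobs = [{"c": 3.0, "p": 2.0}, {"c": 1.0, "p": 4.0}, {"c": 5.0, "p": 1.0}]
    print(total_completion_cost(static_index_order(jobs)))  # prints 21.0

The dynamic (Gittins) and restless (Whittle) indices discussed in the abstract generalize this idea to stochastic projects whose indices depend on the current state rather than on fixed job parameters.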