A system manager dynamically controls a diffusion process Z that lives in a finite interval [0,b]. Control takes the form of a negative drift rate θ that is chosen from a fixed set A of available values. The controlled process evolves according to the differential relationship dZ=dX−θ(Z) dt+dL−dU, where X is a (0,σ) Brownian motion, and L and U are increasing processes that enforce a lower reflecting barrier at Z=0 and an upper reflecting barrier at Z=b, respectively. The cumulative cost process increases according to the differential relationship dξ=c(θ(Z)) dt+p dU, where c(⋅) is a nondecreasing cost of control and p>0 is a penalty rate associated with displacement at the upper boundary. The objective is to minimize long-run average cost. This problem is solved explicitly, which allows one to also solve the following, essentially equivalent formulation: minimize the long-run average cost of control subject to an upper bound constraint on the average rate at which U increases. The two special problem features that allow an explicit solution are the use of a long-run average cost criterion, as opposed to a discounted cost criterion, and the lack of state-related costs other than boundary displacement penalties. The application of this theory to power control in wireless communication is discussed.
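The dynamics and cost structure described above can be illustrated numerically. The following is a minimal simulation sketch, not the explicit solution the abstract refers to: it discretizes dZ=dX−θ(Z) dt+dL−dU with a simple Euler scheme, enforces the two reflecting barriers by projection onto [0,b], and estimates the long-run average cost and the average rate at which U increases. The function name `simulate_average_cost`, the parameter values, the example drift policy, and the linear cost function are all hypothetical choices made for illustration, and σ is treated here as the standard deviation per unit time of X.

```python
import random


def simulate_average_cost(theta, cost, b=1.0, sigma=1.0, p=2.0,
                          horizon=20.0, dt=1e-3, z0=0.0, seed=0):
    """Euler sketch of dZ = dX - theta(Z) dt + dL - dU on [0, b].

    theta: hypothetical policy mapping state z to a drift rate.
    cost:  hypothetical cost-of-control function c(.).
    Returns (estimated average cost, estimated average rate of U).
    """
    rng = random.Random(seed)
    z = z0
    total_cost = 0.0
    total_u = 0.0
    n_steps = int(horizon / dt)
    sqrt_dt = dt ** 0.5

    for _ in range(n_steps):
        rate = theta(z)
        # Running cost of control: c(theta(Z)) dt.
        total_cost += cost(rate) * dt
        # Unconstrained Euler step: Brownian increment minus controlled drift.
        z += sigma * sqrt_dt * rng.gauss(0.0, 1.0) - rate * dt
        if z < 0.0:
            # Lower barrier: the increment of L pushes Z back to 0 (no cost).
            z = 0.0
        elif z > b:
            # Upper barrier: the overshoot approximates the increment dU,
            # which is charged at penalty rate p.
            du = z - b
            total_u += du
            total_cost += p * du
            z = b

    elapsed = n_steps * dt
    return total_cost / elapsed, total_u / elapsed


if __name__ == "__main__":
    # Hypothetical policy: apply more negative drift in the upper half.
    policy = lambda z: 2.0 if z > 0.5 else 0.0
    avg_cost, u_rate = simulate_average_cost(policy, cost=lambda r: r)
    print(avg_cost, u_rate)
```

Projection at discrete time steps tends to understate boundary displacement, so the estimates are only indicative; a smaller `dt` and a longer `horizon` reduce the bias and the variance, respectively. The second return value is the quantity bounded in the constrained formulation.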