Abstract

Consider a process $X(\cdot) = \{X(t),\, 0 \leq t < \infty\}$ that takes values in the interval $I = (0, 1)$, satisfies a stochastic differential equation $$dX(t) = \beta(t)\,dt + \sigma(t)\,dW(t), \qquad X(0) = x \in I,$$ and is absorbed when it reaches an endpoint of the interval $I$. Suppose that the parameters $\beta$ and $\sigma$ are selected by a controller at each instant $t \in [0, \infty)$ from a set depending on the current position. Assume also that the controller selects a stopping time $\tau$ for the process and seeks to maximize $\mathbf{E}\,u(X(\tau))$, where $u: [0, 1] \to \mathbb{R}$ is a continuous reward function. If $\lambda := \inf\{x \in I : u(x) = \max u\}$ and $\rho := \sup\{x \in I : u(x) = \max u\}$, then, to the left of $\lambda$, it is best to maximize the mean-variance ratio $\beta/\sigma^2$ or to stop, and, to the right of $\rho$, it is best to minimize that ratio or to stop. Between $\lambda$ and $\rho$, it is optimal to follow any policy that brings the process $X(\cdot)$ to a point of maximum of the function $u(\cdot)$ with probability 1, and then to stop.
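The policy described in the abstract can be illustrated with a small Monte Carlo sketch. Everything concrete below is an assumption for illustration only: the reward $u(x) = x(1-x)$ (so $\lambda = \rho = 0.5$), the control sets $\beta \in [-0.5, 0.5]$ and $\sigma \in [0.5, 1]$, and an Euler-Maruyama discretization with a numerical stopping tolerance standing in for stopping exactly at the maximizer. Left of the maximizer the ratio $\beta/\sigma^2$ is maximized by $(\beta, \sigma) = (0.5, 0.5)$; right of it, minimized by $(-0.5, 0.5)$.

```python
import numpy as np

# Hypothetical example, not from the paper: reward u(x) = x(1-x),
# maximized at x* = 0.5, so lambda = rho = 0.5.
def u(x):
    return x * (1.0 - x)

def simulate(x0, dt=1e-3, tol=1e-2, seed=0):
    """Run the abstract's policy from x0 until stopping or absorption."""
    rng = np.random.default_rng(seed)
    x = x0
    while True:
        if abs(x - 0.5) < tol:
            # Numerical proxy for "stop at a maximizer of u".
            return u(x)
        # Policy: maximize beta/sigma^2 left of the maximizer,
        # minimize it to the right (assumed control sets).
        beta, sigma = (0.5, 0.5) if x < 0.5 else (-0.5, 0.5)
        x += beta * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        if x <= 0.0 or x >= 1.0:
            # Absorbed at an endpoint, where u(0) = u(1) = 0.
            return 0.0

rewards = [simulate(0.2, seed=s) for s in range(200)]
print(np.mean(rewards))
```

Because the diffusion may still be absorbed at 0 before reaching the maximizer, the average realized reward lies strictly between 0 and $\max u = 0.25$; the policy maximizes it within the assumed control sets but cannot guarantee the maximum itself.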
