Existence of Optimal Stopping Rules for Rewards Related to $S_n/n$

David Siegmund, Paul Feder, Gordon Simons

doi:10.1214/aoms/1177698248

Abstract

Let $\{X_n, n \geqq 1\}$ be a sequence of independent and identically distributed random variables with mean 0 and finite $\nu$th moment for some $\nu \geqq 2$. Let $S_n = \sum_{i=1}^n X_i$. We observe the $X$'s sequentially and must decide when to stop sampling. If we stop at time $n$ we receive a reward of the form $h_n(S_n)$, and we are concerned with finding stopping rules $t$ which maximize our expected reward, $E[h_t(S_t)]$. In particular we are concerned with showing that the so-called "functional equation rule" (FER) (for a definition, see Section 2) is such an (optimal) rule. Chow and Robbins [2] have treated the reward sequence $n^{-1}S_n$ with $X_i = \pm 1$ each with probability $\frac{1}{2}$. Dvoretzky [4] considered reward sequences of the form $n^{-\alpha}S_n$ $(\alpha > \frac{1}{2})$ under the assumptions $EX_i = 0$, $EX^2_i < \infty$. Teicher and Wolfowitz [7] considered sequences $c_nS_n^\beta$ $(\beta = 1, 2)$ under the same assumptions on the distribution of $X_i$. They impose the conditions $c_n > 0$, $c^2_{n+1} \leqq c_{n+2}c_n$, $(n + 1)^\beta c_{n+1} \leqq n^\beta c_n$. We establish certain principles which allow us to relate reward sequences $h_n$ of a particularly simple form to others of a more complicated form, and in the process conclude that the FER is optimal in the more complicated situation. Using the basic reward sequence $n^{-\alpha}|S_n|^\beta$ with $2\alpha > \beta > 0$ and assuming that $E|X_i|^{\max(2,\beta)} < \infty$, we examine the problem of optimality for reward sequences of the form $c_nS^+_n$, $c_n|S_n|^\beta$, $n^{-\alpha} \log|S_n|$, etc., where $c_1, c_2, \cdots$ are constants such that $\limsup_{n\rightarrow\infty} n^\alpha c_n < \infty$.
It seems somewhat customary in optimal stopping problems to try to verify that the reward sequence (in our case $h_n(S_n)$) is majorized by a non-negative random variable with finite expectation, and then to appeal to one of a class of general theorems in which this regularity condition appears. In the problems we consider we have found it easier to disregard this possibility and to use a direct approach. We begin with a formal presentation of the problem and a development of preliminary results which show that the FER is optimal provided that it is a.s. finite. In later sections we develop and exploit machinery for relating two or more reward sequences and verify that the FER is a.s. finite for the basic reward sequence.
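
As a purely illustrative sketch, not taken from the paper, the Chow and Robbins case (reward $n^{-1}S_n$ with $X_i = \pm 1$ each with probability $\frac{1}{2}$) can be explored numerically by forcing a stop at a finite horizon and applying backward induction. The truncation is an assumption made here only for computability; the FER discussed above concerns the untruncated problem. The function name `truncated_value` and the parameter `horizon` are hypothetical.

```python
def truncated_value(horizon: int) -> float:
    """Expected reward of the best rule that takes at least one and at most
    `horizon` observations, for reward S_n / n with fair +/-1 steps."""
    # At time n the walk is at s = 2k - n, where k counts the +1 steps.
    # At the horizon we are forced to stop, so the value is the reward itself.
    values = [(2 * k - horizon) / horizon for k in range(horizon + 1)]

    # Step backwards: at each state compare stopping now with the
    # expected value of taking one more observation.
    for n in range(horizon - 1, 0, -1):
        values = [
            max((2 * k - n) / n, 0.5 * (values[k] + values[k + 1]))
            for k in range(n + 1)
        ]

    # At time 0 there is no reward to collect, so the first observation
    # is always taken; average over the two possible first steps.
    return 0.5 * (values[0] + values[1])


if __name__ == "__main__":
    for horizon in (1, 2, 3, 10, 100):
        print(horizon, truncated_value(horizon))
```

Because every rule bounded by a given horizon is also admissible for any larger horizon, the computed values are non-decreasing in `horizon`. They bound from below the supremum of $E[h_t(S_t)]$ over all stopping rules.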
