Multi-Armed Bandits with Discount Factor Near One: The Bernoulli Case

Abstract

Each of $n$ arms generates an infinite sequence of Bernoulli random variables. The parameters of the sequences are themselves random variables, and are independent with a common distribution satisfying a mild regularity condition. At each stage we must choose an arm to observe (or pull) based on past observations, and our aim is to maximize the expected discounted sum of the observations. In this paper it is shown that, as the discount factor approaches one, the optimal policy tends to the rule of least failures, defined as follows: pull the arm which has incurred the least number of failures or, if this does not define an arm uniquely, select from amongst the set of arms which have incurred the least number of failures an arm with the largest number of successes.
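
The least-failures rule described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's construction: the function names, the lowest-index choice when arms remain tied on both failures and successes, and the finite simulation horizon are all assumptions introduced here.

```python
import random


def least_failures_arm(successes, failures):
    """Pick an arm by the least-failures rule.

    Fewest failures wins; ties are broken by the largest number of
    successes. Any remaining tie goes to the lowest index, which is an
    assumption (the abstract leaves that case open).
    """
    return min(range(len(failures)), key=lambda i: (failures[i], -successes[i]))


def simulate_discounted_reward(thetas, discount, horizon, rng):
    """Run the rule on Bernoulli arms and return the discounted reward sum.

    thetas   -- success probability of each arm (hypothetical inputs)
    discount -- discount factor in (0, 1)
    horizon  -- number of pulls simulated (a truncation of the infinite sum)
    rng      -- a random.Random instance
    """
    n = len(thetas)
    successes = [0] * n
    failures = [0] * n
    total, weight = 0.0, 1.0
    for _ in range(horizon):
        arm = least_failures_arm(successes, failures)
        reward = 1 if rng.random() < thetas[arm] else 0
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
        total += weight * reward
        weight *= discount
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    # Arm parameters drawn from a uniform prior, chosen purely for illustration.
    thetas = [rng.random() for _ in range(5)]
    print(simulate_discounted_reward(thetas, discount=0.99, horizon=1000, rng=rng))
```

Note that the rule keeps pulling the same arm until it fails, then moves to whichever arm has failed least so far, so it never needs the prior distribution or the discount factor to make a choice.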

Similar Papers
  • Research Article
  • Cite Count Icon 2
  • 10.1016/s0167-7152(99)00009-7
Maximal inequalities for averages of i.i.d. and 2-exchangeable random variables
  • Jul 27, 1999
  • Statistics & Probability Letters
  • N Etemadi

  • Research Article
  • Cite Count Icon 320
  • 10.1214/aop/1176995146
Approximation Theorems for Independent and Weakly Dependent Random Vectors
  • Feb 1, 1979
  • The Annals of Probability
  • Istvan Berkes + 1 more

In this paper we prove approximation theorems of the following type. Let $\{X_k, k \geqslant 1\}$ be a sequence of random variables with values in $\mathbb{R}^{d_k}, d_k \geqslant 1$ and let $\{G_k, k \geqslant 1\}$ be a sequence of probability distributions on $\mathbb{R}^{d_k}$ with characteristic functions $g_k$ respectively. If for each $k \geqslant 1$ the conditional characteristic function of $X_k$ given $X_1, \cdots, X_{k - 1}$ is close to $g_k$ and if $G_k$ has small tails, then there exists a sequence of independent random variables $Y_k$ with distribution $G_k$ such that $|X_k - Y_k|$ is small with large probability. As an application we prove almost sure invariance principles for sums of independent identically distributed random variables with values in $\mathbb{R}^d$ and for sums of $\phi$-mixing random variables with a logarithmic mixing rate.

  • Research Article
  • Cite Count Icon 49
  • 10.1017/s0021900200007257
The Strong Law of Large Numbers for Extended Negatively Dependent Random Variables
  • Dec 1, 2010
  • Journal of Applied Probability
  • Yiqing Chen + 2 more

A sequence of random variables is said to be extended negatively dependent (END) if the tails of its finite-dimensional distributions in the lower-left and upper-right corners are dominated by a multiple of the tails of the corresponding finite-dimensional distributions of a sequence of independent random variables with the same marginal distributions. The goal of this paper is to establish the strong law of large numbers for a sequence of END and identically distributed random variables. In doing so we derive some new inequalities of large deviation type for the sums of END and identically distributed random variables being suitably truncated. We also show applications of our main result to risk theory and renewal theory.

  • Research Article
  • Cite Count Icon 95
  • 10.1239/jap/1294170508
The Strong Law of Large Numbers for Extended Negatively Dependent Random Variables
  • Dec 1, 2010
  • Journal of Applied Probability
  • Yiqing Chen + 2 more

A sequence of random variables is said to be extended negatively dependent (END) if the tails of its finite-dimensional distributions in the lower-left and upper-right corners are dominated by a multiple of the tails of the corresponding finite-dimensional distributions of a sequence of independent random variables with the same marginal distributions. The goal of this paper is to establish the strong law of large numbers for a sequence of END and identically distributed random variables. In doing so we derive some new inequalities of large deviation type for the sums of END and identically distributed random variables being suitably truncated. We also show applications of our main result to risk theory and renewal theory.

  • Research Article
  • Cite Count Icon 100
  • 10.1090/s0002-9939-1960-0112190-3
A martingale inequality and the law of large numbers
  • Jan 1, 1960
  • Proceedings of the American Mathematical Society
  • Y S Chow

The original Kolmogorov's inequality [6] has been extended to a martingale inequality by Levy [8] and Ville [12] and later to a semimartingale inequality by Doob [3]. In this note we will extend (1) to a semimartingale inequality which contains Doob's inequality as a special case. As Kolmogorov's inequality is the key to the proof of the law of large numbers for a sequence of independent random variables, we will use our inequality to prove a law of large numbers for a martingale, which will be shown to include the extensions of Kolmogorov's law of large numbers for independent random variables [7] made by Brunk [1], Chung [2], Kawata and Udagawa [5], and Prohorov [11], and for dependent random variables made by Levy [8] and Loeve [9]. In the following $(W, F, P)$ will be a probability space, $c_1, c_2, \ldots$ a nonincreasing sequence of positive numbers, $x_1, x_2, \ldots$ a sequence of random variables, $y_k = x_1 + x_2 + \cdots + x_k$ and $F_k$ the Borel field generated by $x_1, x_2, \ldots, x_k$ for each $k$, and for a random variable $z$ we put $z^+ = \max(z, 0)$.

  • Research Article
  • Cite Count Icon 184
  • 10.1016/0304-4149(82)90050-3
On a continuous analogue of the stochastic difference equation Xn=ρ Xn-1 + Bn
  • May 1, 1982
  • Stochastic Processes and their Applications
  • Stephen James Wolfe

  • Research Article
  • Cite Count Icon 6
  • 10.1090/tran/7034
Extended de Finetti theorems for boolean independence and monotone independence
  • Oct 24, 2017
  • Transactions of the American Mathematical Society
  • Weihua Liu

We construct several new spaces of quantum sequences and their quantum families of maps in the sense of Sołtan. The noncommutative distributional symmetries associated with these quantum maps are noncommutative versions of spreadability and partial exchangeability. Then, we study simple relations between these symmetries. We will focus on studying two kinds of noncommutative distributional symmetries: monotone spreadability and boolean spreadability. We provide an example of a spreadable sequence of random variables for which the usual unilateral shift is an unbounded map. As a result, it is natural to study bilateral sequences of random objects, which are indexed by integers, rather than unilateral sequences. At the end of the paper, we will show Ryll-Nardzewski type theorems for monotone independence and boolean independence: Roughly speaking, an infinite bilateral sequence of random variables is monotonically (boolean) spreadable if and only if the variables are identically distributed and monotone (boolean) with respect to the conditional expectation onto its tail algebra. For an infinite sequence of noncommutative random variables, boolean spreadability is equivalent to boolean exchangeability.

  • Research Article
  • Cite Count Icon 21
  • 10.1017/s0305004100056139
Subsequence principles for vector-valued random variables
  • Sep 1, 1979
  • Mathematical Proceedings of the Cambridge Philosophical Society
  • D J H Garling

1. Introduction. Révész (8) has shown that if (fn) is a sequence of random variables, bounded in L2, there exists a subsequence (fnk) and a random variable f in L2 such that converges almost surely whenever . Komlós (5) has shown that if (fn) is a sequence of random variables, bounded in L1, then there is a subsequence (A*) with the property that the Cesàro averages of any subsequence converge almost surely. Subsequently Chatterji (2) showed that if (fn) is bounded in Lp (where 0 < p ≤ 2) then there is a subsequence (gk) = (fnk) and f in Lp such that almost surely for every sub-subsequence. All of these results are examples of subsequence principles: a sequence of random variables, satisfying an appropriate moment condition, has a subsequence which satisfies some property enjoyed by sequences of independent identically distributed random variables. Recently Aldous (1), using tightness arguments, has shown that for a general class of properties such a subsequence principle holds: in particular, the results listed above are all special cases of Aldous' principal result.

  • Research Article
  • Cite Count Icon 11
  • 10.1090/s0002-9939-1990-0990432-9
Bounds on the expectation of functions of martingales and sums of positive RVs in terms of norms of sums of independent random variables
  • Jan 1, 1990
  • Proceedings of the American Mathematical Society
  • Victor H De La Peña

Let $(x_i)$ be a sequence of random variables. Let $(w_i)$ be a sequence of independent random variables such that for each $i$, $w_i$ has the same distribution as $x_i$. If $S_n = x_1 + x_2 + \cdots + x_n$ is a martingale and $\Psi$ is a convex increasing function such that $\Psi(\sqrt{x})$ is concave on $[0,\infty)$ and $\Psi(0) = 0$, then \[ E\Psi\left( \max_{j \leq n} \left| \sum_{i=1}^{j} x_i \right| \right) < C E\Psi\left( \left| \sum_{i=1}^{n} w_i \right| \right) \] for a universal constant $C$ $(0 < C < \infty)$ independent of $\Psi$, $n$, and $(x_i)$. The same inequality holds if $(x_i)$ is a sequence of nonnegative random variables and $\Psi$ is now any nondecreasing concave function on $[0,\infty)$ with $\Psi(0) = 0$. Interestingly, if $\Psi(\sqrt{x})$ is convex and $\Psi$ grows at most polynomially fast, the above inequality reverses. By comparing martingales to sums of independent random variables, this paper presents a one-sided approximation to the order of magnitude of expectations of functions of martingales. This approximation is best possible among all approximations depending only on the one-dimensional distribution of the martingale differences.

  • Research Article
  • Cite Count Icon 1
  • 10.3390/axioms13110739
Arbitrary Random Variables and Wiman’s Inequality
  • Oct 29, 2024
  • Axioms
  • Andriy Kuryliak + 2 more

We study the class of random entire functions given by power series, in which the coefficients are formed as the product of an arbitrary sequence of complex numbers and two sequences of random variables. One of them is the Rademacher sequence, and the other is an arbitrary complex-valued sequence from the class of sequences of random variables, determined by a certain restriction on the growth of absolute moments of a fixed degree from the maximum of the module of each finite subset of random variables. In the paper we prove sharp Wiman–Valiron’s type inequality for such random entire functions, which for given p∈(0;1) holds with a probability p outside some set of finite logarithmic measure. We also considered another class of random entire functions given by power series with coefficients, which, as above, are pairwise products of the elements of an arbitrary sequence of complex numbers and a sequence of complex-valued random variables described above. In this case, similar new statements about not improvable inequalities are also obtained.

  • Research Article
  • Cite Count Icon 216
  • 10.1111/j.2517-6161.1978.tb01653.x
Discrete Time Series Generated by Mixtures. I: Correlational and Runs Properties
  • Sep 1, 1978
  • Journal of the Royal Statistical Society Series B: Statistical Methodology
  • P A Jacobs + 1 more

A broad but parametrically simple model for a stationary sequence of dependent discrete random variables is given and several submodels are discussed. The structure of the model is specified by the marginal distribution of the random variables and several other parameters. The sequence of random variables is formed by a probabilistic linear combination of i.i.d. discrete random variables and is in general not Markovian. Second-order joint moments and spectra are obtained for the model, as well as some properties for the lengths of runs. The special case of a process in which the variables take on only two values is considered; this binary process is useful as a model for the counting process in a discrete-time point process.

  • Research Article
  • Cite Count Icon 9
  • 10.1090/s0002-9939-98-04254-3
A maximal inequality for partial sums of finite exchangeable sequences of random variables
  • Jan 1, 1998
  • Proceedings of the American Mathematical Society
  • Alexander R Pruss

Let $X_1, X_2, \ldots, X_{2n}$ be a finite exchangeable sequence of Banach space valued random variables, i.e., a sequence such that all joint distributions are invariant under permutations of the variables. We prove that there is an absolute constant $c$ such that if $S_j = \sum_{i=1}^{j} X_i$, then \[ P\left( \sup_{1 \le j \le 2n} \|S_j\| > \lambda \right) \le c P\left( \|S_n\| > \lambda/c \right) \] for all $\lambda \ge 0$. This generalizes an inequality of Montgomery-Smith and Latala for independent and identically distributed random variables. Our maximal inequality is apparently new even if $X_1, X_2, \ldots$ is an infinite exchangeable sequence of random variables. As a corollary of our result, we obtain a comparison inequality for tail probabilities of sums of arbitrary random variables over random subsets of the indices. Montgomery-Smith [8] and Latala [7] have independently proved that if $X_1, \ldots, X_n$ are independent and identically distributed Banach space valued random variables, then (1) P( sup ZX, > AZ A/C I 0 and 1 < k < n, where $c$ is an absolute constant. It is obvious that this cannot hold for arbitrary independent random variables; as Montgomery-Smith [8] notes, we need only let k -= n = 2, $X_1 \equiv 1$ and $X_2 \equiv -1$ to see this. Levy's inequality says that (1) also holds for arbitrary independent symmetric random variables $X_i$ (not necessarily identically distributed). For positive random variables, (1) is trivial, of course. A natural and much-studied extension of the concept of independent and identically distributed random variables is that of exchangeable random variables. We say that a finite sequence $X_1, \ldots, X_n$ of (not necessarily independent) random variables is exchangeable if the $n$-tuples $(X_1, \ldots, X_n)$ and $(X_{\pi(1)}, \ldots, X_{\pi(n)})$ both have the same distribution whenever $\pi$ is a permutation of $[n] = \{1, \ldots, n\}$. Evidently an exchangeable sequence of independent random variables is precisely a sequence of independent and identically distributed random variables. Received by the editors August 2, 1996 and, in revised form, December 2, 1996. 1991 Mathematics Subject Classification. Primary 60E15.

  • Research Article
  • Cite Count Icon 24
  • 10.1016/j.jfa.2015.07.007
A noncommutative de Finetti theorem for boolean independence
  • Aug 3, 2015
  • Journal of Functional Analysis
  • Weihua Liu

  • Research Article
  • Cite Count Icon 1
  • 10.3390/math11163494
Equivalent Conditions of Complete p-th Moment Convergence for Weighted Sum of ND Random Variables under Sublinear Expectation Space
  • Aug 13, 2023
  • Mathematics
  • Peiyu Sun + 2 more

We investigate the complete convergence for weighted sums of sequences of negative dependence (ND) random variables and p-th moment convergence for weighted sums of sequences of ND random variables under sublinear expectation space. Using moment inequality and truncation methods, we prove the equivalent conditions of complete convergence for weighted sums of sequences of ND random variables and p-th moment convergence for weighted sums of sequences of ND random variables under sublinear expectation space.

  • Research Article
  • Cite Count Icon 1
  • 10.3103/s1063454117030116
Comparison of numbers of records in the sequences of discrete and continuous random variables
  • Jul 1, 2017
  • Vestnik St. Petersburg University, Mathematics
  • V B Nevzorov

Different record achievements are fixed in many domains of human activities. This process very often happens with some rate of digitization (up to seconds, meters, or thousands of individuals) of the observed results. By the examples of exponential and geometrical distributions, it is shown how such a type of the transitions from continuous to discrete distributions may vary the numbers of the record values in the corresponding sequences of the random variables.
