Articles published on coupon-collectors-problem
116 Search results
- Research Article
45
- 10.1239/aap/1331216649
- Mar 1, 2012
- Advances in Applied Probability
- Aristides V Doumas + 1 more
We develop techniques for computing the asymptotics of the first and second moments of the number T_N of coupons that a collector has to buy in order to find all N existing different coupons as N → ∞. The probabilities (occurring frequencies) of the coupons can be quite arbitrary. From these asymptotics we obtain the leading behavior of the variance V[T_N] of T_N (see Theorems 3.1 and 4.4). Then, we combine our results with the general limit theorems of Neal in order to derive the limit distribution of T_N (appropriately normalized), which, for a large class of probabilities, turns out to be the standard Gumbel distribution. We also give various illustrative examples.
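For small N, the first moment of T_N under arbitrary coupon probabilities can be computed exactly, which may help when checking asymptotic results numerically. The following is a minimal sketch using the standard inclusion-exclusion identity E[T_N] = Σ over nonempty subsets S of (−1)^(|S|+1) / Σ_{i∈S} p_i. It is not the technique of the paper, and the function name `expected_collection_time` is a hypothetical label chosen for illustration.

```python
from fractions import Fraction
from itertools import combinations


def expected_collection_time(probs):
    """Exact E[T_N] for coupon probabilities `probs` (should sum to 1).

    Uses inclusion-exclusion over all nonempty subsets of coupon types,
    so the cost grows as 2**N; intended only for small N.
    """
    total = Fraction(0)
    for size in range(1, len(probs) + 1):
        # Odd-sized subsets contribute positively, even-sized negatively.
        sign = 1 if size % 2 == 1 else -1
        for subset in combinations(probs, size):
            total += sign / sum(subset)
    return total


if __name__ == "__main__":
    uniform = [Fraction(1, 3)] * 3
    skewed = [Fraction(1, 4), Fraction(3, 4)]
    print(expected_collection_time(uniform))
    print(expected_collection_time(skewed))
```

Passing `Fraction` values keeps the arithmetic exact; floats also work but accumulate rounding error as the alternating sum grows.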
- Research Article
3
- 10.1017/s0021900200008639
- Dec 1, 2011
- Journal of Applied Probability
- Weiyu Xu + 1 more
This paper presents an analysis of a generalized version of the coupon collector problem, in which the collector receives d coupons each run and chooses the least-collected coupon so far. In the asymptotic case when the number of coupons n goes to infinity, we show that, on average, (n log n)/d + (n/d)(m − 1) log log n + O(mn) runs are needed to collect m sets of coupons. An exact algorithm is also developed for any finite case to compute the exact mean number of runs. Numerical examples are provided to verify our theoretical predictions.
- Research Article
16
- 10.1239/jap/1324046020
- Dec 1, 2011
- Journal of Applied Probability
- Weiyu Xu + 1 more
This paper presents an analysis of a generalized version of the coupon collector problem, in which the collector receives d coupons each run and chooses the least-collected coupon so far. In the asymptotic case when the number of coupons n goes to infinity, we show that, on average, (n log n)/d + (n/d)(m − 1) log log n + O(mn) runs are needed to collect m sets of coupons. An exact algorithm is also developed for any finite case to compute the exact mean number of runs. Numerical examples are provided to verify our theoretical predictions.
- Research Article
12
- 10.1007/s11009-011-9247-6
- Aug 27, 2011
- Methodology and Computing in Applied Probability
- Sunil Abraham + 3 more
An Omnibus Sequence of length n is one that has each possible "message" of length k embedded in it as a subsequence. We study various properties of Omnibus Sequences in this paper, making connections, whenever possible, to the classical coupon collector problem.
- Research Article
5
- 10.1017/s0021900200007713
- Mar 1, 2011
- Journal of Applied Probability
- R T Smythe
We consider a generalized form of the coupon collection problem in which a random number, S, of balls is drawn at each stage from an urn initially containing n white balls (coupons). Each white ball drawn is colored red and returned to the urn; red balls drawn are simply returned to the urn. The question considered is then: how many white balls (uncollected coupons) remain in the urn after the k_n draws? Our analysis is asymptotic as n → ∞. We concentrate on the case when k_n draws are made, where k_n/n → ∞ (the superlinear case), although we sketch known results for other ranges of k_n. A Gaussian limit is obtained via a martingale representation for the lower superlinear range, and a Poisson limit is derived for the upper boundary of this range via the Chen-Stein approximation.
- Research Article
13
- 10.1239/jap/1300198144
- Mar 1, 2011
- Journal of Applied Probability
- R T Smythe
We consider a generalized form of the coupon collection problem in which a random number, S, of balls is drawn at each stage from an urn initially containing n white balls (coupons). Each white ball drawn is colored red and returned to the urn; red balls drawn are simply returned to the urn. The question considered is then: how many white balls (uncollected coupons) remain in the urn after the k_n draws? Our analysis is asymptotic as n → ∞. We concentrate on the case when k_n draws are made, where k_n/n → ∞ (the superlinear case), although we sketch known results for other ranges of k_n. A Gaussian limit is obtained via a martingale representation for the lower superlinear range, and a Poisson limit is derived for the upper boundary of this range via the Chen-Stein approximation.
- Research Article
10
- 10.1016/j.jspi.2011.01.020
- Feb 2, 2011
- Journal of Statistical Planning and Inference
- Jelena Jocković + 1 more
Coupon collector's problem and generalized Pareto distributions
- Research Article
12
- 10.1239/aap/1293113148
- Dec 1, 2010
- Advances in Applied Probability
- Hosam M Mahmoud
In this paper we consider a generalized coupon collection problem in which a customer repeatedly buys a random number of distinct coupons in order to gather a large number n of available coupons. We address the following question: How many different coupons are collected after k = k_n draws, as n → ∞? We identify three phases of k_n: the sublinear, the linear, and the superlinear. In the growing sublinear phase we see o(n) different coupons, and, with true randomness in the number of purchases, under the appropriate centering and scaling, a Gaussian distribution is obtained across the entire phase. However, if the number of purchases is fixed, a degeneracy arises and normality holds only at the higher end of this phase. If the number of purchases has a fixed range, the small number of different coupons collected in the sublinear phase is upgraded to a number in need of centering and scaling to become normally distributed in the linear phase with a different normal distribution of the type that appears in the usual central limit theorems. The Gaussian results are obtained via martingale theory. We say a few words in passing about the high probability of collecting nearly all the coupons in the superlinear phase. It is our aim to present the results in a way that explores the critical transition at the ‘seam line’ between different Gaussian phases, and between these phases and other nonnormal phases.
- Research Article
3
- 10.1017/s0001867800004493
- Dec 1, 2010
- Advances in Applied Probability
- Hosam M Mahmoud
In this paper we consider a generalized coupon collection problem in which a customer repeatedly buys a random number of distinct coupons in order to gather a large number n of available coupons. We address the following question: How many different coupons are collected after k = k_n draws, as n → ∞? We identify three phases of k_n: the sublinear, the linear, and the superlinear. In the growing sublinear phase we see o(n) different coupons, and, with true randomness in the number of purchases, under the appropriate centering and scaling, a Gaussian distribution is obtained across the entire phase. However, if the number of purchases is fixed, a degeneracy arises and normality holds only at the higher end of this phase. If the number of purchases has a fixed range, the small number of different coupons collected in the sublinear phase is upgraded to a number in need of centering and scaling to become normally distributed in the linear phase with a different normal distribution of the type that appears in the usual central limit theorems. The Gaussian results are obtained via martingale theory. We say a few words in passing about the high probability of collecting nearly all the coupons in the superlinear phase. It is our aim to present the results in a way that explores the critical transition at the ‘seam line’ between different Gaussian phases, and between these phases and other nonnormal phases.
- Research Article
115
- 10.1109/tit.2010.2095111
- Nov 15, 2010
- IEEE Transactions on Information Theory
- Yao Li + 2 more
To reduce computational complexity and delay in randomized network coded content distribution, and for some other practical reasons, coding is not performed simultaneously over all content blocks, but over much smaller, possibly overlapping subsets of these blocks, known as generations. A penalty of this strategy is throughput reduction. To analyze the throughput loss, we model coding over generations with random generation scheduling as a coupon collector's brotherhood problem. This model enables us to derive the expected number of coded packets needed for successful decoding of the entire content as well as the probability of decoding failure (the latter only when generations do not overlap) and further, to quantify the tradeoff between computational complexity and throughput. Interestingly, with a moderate increase in the generation size, throughput quickly approaches link capacity. Overlaps between generations can further improve throughput substantially for relatively small generation sizes.
- Research Article
14
- 10.1109/jsac.2010.100922
- Sep 1, 2010
- IEEE Journal on Selected Areas in Communications
- Weiyao Xiao + 1 more
The advent of practical rateless codes enables implementation of highly efficient packet-level forward error correction (FEC) strategies for reliable data broadcasting in loss-prone wireless networks, such as sensor networks. Yet, the critical question of accurately quantifying the proper amount of redundancy has remained largely unsolved. In this paper, we exploit advances in extreme value theory to rigorously address this problem. Under the asymptotic regime of a large number of receivers, we derive a closed-form expression for the cumulative distribution function (CDF) of the completion time of file distribution. We show the existence of a phase transition associated with this CDF and accurately locate the transition point. We derive tight convergence bounds demonstrating the accuracy of the asymptotic estimate for the practical case of a finite number of receivers. Further, we asymptotically characterize the CDF of the completion time under heterogeneous packet loss, by establishing a close relationship between the data broadcasting and multi-set coupon collector problems. We demonstrate the benefits of our approach through simulation and through real experiments on a Tmote Sky sensor testbed. Specifically, we augment the existing Rateless Deluge software dissemination protocol with an extreme value FEC strategy. The experimental results reveal reduction by a factor of five in retransmission request messages and by a factor of two in total dissemination time, at the cost of a marginally higher number of data packet transmissions in the order of 5%.
- Research Article
8
- 10.1239/jap/1245676108
- Jun 1, 2009
- Journal of Applied Probability
- Anna Pósfai
In this paper we refine a Poisson limit theorem of Gnedenko and Kolmogorov (1954): we determine the error order of a Poisson approximation for sums of asymptotically negligible integer-valued random variables that converge in distribution to the Poisson law. As an application of our results, we investigate the case of the coupon collector's problem when the distribution of the collector's waiting time is asymptotically Poisson.
- Research Article
- 10.1017/s0021900200005660
- Jun 1, 2009
- Journal of Applied Probability
- Anna Pósfai
In this paper we refine a Poisson limit theorem of Gnedenko and Kolmogorov (1954): we determine the error order of a Poisson approximation for sums of asymptotically negligible integer-valued random variables that converge in distribution to the Poisson law. As an application of our results, we investigate the case of the coupon collector's problem when the distribution of the collector's waiting time is asymptotically Poisson.
- Research Article
6
- 10.1080/08982110802642555
- Mar 13, 2009
- Quality Engineering
- Stephen N Luko
This article examines the classical “coupon collector's problem” and applies several key results to certain problems in quality control sampling. It is shown how solutions to the coupon sampling problem are readily adaptable to industrial sampling problems where the intent is to sample at least 1 of each of k specific product unit types in a well-mixed stream of product where the product unit is equally likely to be any of the k specific types. Several variations on this theme are presented. Theoretical formulas are developed and applied to quality control sampling. Additional applications involving Monte Carlo simulation using the freeware programming language “R” are also illustrated.
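The classical setting described in this entry (k equally likely unit types, sampling until every type has been seen) is easy to check by simulation. The sketch below is written in Python rather than R and is not taken from the article; the function names, trial count, and seed are assumptions made for illustration. It compares a Monte Carlo estimate with the well-known closed form k·(1 + 1/2 + … + 1/k) for the expected number of samples.

```python
import random


def harmonic_expected_draws(k):
    """Closed-form expected number of draws to see all k equally likely types."""
    return k * sum(1.0 / i for i in range(1, k + 1))


def simulate_draws(k, rng):
    """Draw uniformly at random until every one of the k types has appeared."""
    seen = set()
    draws = 0
    while len(seen) < k:
        seen.add(rng.randrange(k))
        draws += 1
    return draws


def monte_carlo_mean(k, trials=20000, seed=0):
    """Average number of draws over `trials` independent simulated collections."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    return sum(simulate_draws(k, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    k = 6
    print("closed form:", harmonic_expected_draws(k))
    print("simulated:  ", monte_carlo_mean(k))
```

For k = 6 the closed form gives 14.7 draws, and the simulated mean should land close to that value for a trial count of this size.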
- Research Article
10
- 10.1109/tpds.2008.47
- Jan 1, 2009
- IEEE Transactions on Parallel and Distributed Systems
- S.R Kundu + 3 more
In networks carrying a large volume of traffic, accurate traffic characterization is necessary for understanding the dynamics and patterns of network resource usage. Previous approaches to flow characterization are based on random sampling of the packets (e.g., Cisco's NetFlow) or inferring characteristics solely based on long lived flows (LLFs) or on lossy data structures (e.g., bloom filters, hash tables). However, none of these approaches takes into account the heavy-tailed nature of the Internet traffic and separates the estimation algorithm from the flow measurement architecture. In this paper, we propose an alternate approach to traffic characterization by closely linking the flow measurement architecture with the estimation algorithm. Our measurement framework stores complete information related to short lived flows (SLFs) while collecting partial information related to LLFs. For real-time separation of LLFs and SLFs, we propose a novel algorithm based on typical sequences from information theory. The distribution (pdf) and sample space of the underlying traffic is estimated using the non-parametric Parzen window technique and likelihood function defined over the coupon collector problem. We validate the accuracy and performance of our estimation technique using traffic traces from the internal LAN in our laboratory and from National Library for Applied Network Research (NLANR).
- Research Article
16
- 10.1007/s10817-008-9113-6
- Nov 1, 2008
- Journal of Automated Reasoning
- Osman Hasan + 1 more
Statistical quantities, such as expectation (mean) and variance, play a vital role in the present age probabilistic analysis. In this paper, we present some formalization of expectation theory that can be used to verify the expectation and variance characteristics of discrete random variables within the HOL theorem prover. The motivation behind this is the ability to perform error free probabilistic analysis, which in turn can be very useful for the performance and reliability analysis of systems used in safety-critical domains, such as space travel, medicine and military. We first present a formal definition of expectation of a function of a discrete random variable. Building upon this definition, we formalize the mathematical concept of variance and verify some classical properties of expectation and variance in HOL. We then utilize these formal definitions to verify the expectation and variance characteristics of the Geometric random variable. In order to demonstrate the practical effectiveness of the formalization presented in this paper, we also present the probabilistic analysis of the Coupon Collector's problem in HOL.
- Research Article
8
- 10.1016/j.comcom.2008.09.006
- Sep 12, 2008
- Computer Communications
- Bin Wu + 1 more
Modeling message propagation in random graph networks
- Research Article
17
- 10.1017/s0021900200004599
- Sep 1, 2008
- Journal of Applied Probability
- Peter Neal
Coupons are collected one at a time from a population containing n distinct types of coupon. The process is repeated until all n coupons have been collected and the total number of draws, Y, from the population is recorded. It is assumed that the draws from the population are independent and identically distributed (draws with replacement) according to a probability distribution X with the probability that a type-i coupon is drawn being P(X = i). The special case where each type of coupon is equally likely to be drawn from the population is the classic coupon collector problem. We consider the asymptotic distribution Y (appropriately normalized) as the number of coupons n → ∞ under general assumptions upon the asymptotic distribution of X. The results are proved by studying the total number of coupons, W(t), not collected in t draws from the population and noting that P(Y ≤ t) = P(W(t) = 0). Two normalizations of Y are considered, the choice of normalization depending upon whether or not a suitable Poisson limit exists for W(t). Finally, extensions to the K-coupon collector problem and the birthday problem are given.
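The identity P(Y ≤ t) = P(W(t) = 0) quoted in this abstract can be evaluated exactly for a small number of coupon types. The sketch below is an illustration under stated assumptions, not the paper's method: it applies inclusion-exclusion over the sets of coupon types that might be missing after t draws, and the names `completion_cdf` and `expected_uncollected` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def completion_cdf(probs, t):
    """P(Y <= t) = P(W(t) = 0): all coupon types seen within t i.i.d. draws.

    Inclusion-exclusion over subsets S of types, where (1 - p_S)**t is the
    probability that no type in S appears in t draws. Cost is 2**n.
    """
    total = Fraction(0)
    for size in range(len(probs) + 1):
        sign = 1 if size % 2 == 0 else -1
        for subset in combinations(probs, size):
            # sum(()) is 0, so the empty subset contributes 1.
            total += sign * (1 - sum(subset)) ** t
    return total


def expected_uncollected(probs, t):
    """E[W(t)]: expected number of coupon types not yet collected after t draws."""
    return sum((1 - p) ** t for p in probs)


if __name__ == "__main__":
    uniform = [Fraction(1, 3)] * 3
    for t in (3, 5, 10):
        print(t, completion_cdf(uniform, t), expected_uncollected(uniform, t))
```

With three equally likely types, the probability of finishing in exactly three draws is 3!/3³ = 2/9, which the function reproduces.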
- Research Article
70
- 10.1239/jap/1222441818
- Sep 1, 2008
- Journal of Applied Probability
- Peter Neal
Coupons are collected one at a time from a population containing n distinct types of coupon. The process is repeated until all n coupons have been collected and the total number of draws, Y, from the population is recorded. It is assumed that the draws from the population are independent and identically distributed (draws with replacement) according to a probability distribution X with the probability that a type-i coupon is drawn being P(X = i). The special case where each type of coupon is equally likely to be drawn from the population is the classic coupon collector problem. We consider the asymptotic distribution Y (appropriately normalized) as the number of coupons n → ∞ under general assumptions upon the asymptotic distribution of X. The results are proved by studying the total number of coupons, W(t), not collected in t draws from the population and noting that P(Y ≤ t) = P(W(t) = 0). Two normalizations of Y are considered, the choice of normalization depending upon whether or not a suitable Poisson limit exists for W(t). Finally, extensions to the K-coupon collector problem and the birthday problem are given.
- Research Article
10
- 10.37236/906
- Aug 18, 2008
- The Electronic Journal of Combinatorics
- Russell May
We analyze a variant of the coupon collector's problem, in which the probabilities of obtaining coupons and the numbers of coupons in a collection may be non-uniform. We obtain a finite expression for the generating function of the probabilities to complete a collection and show how this generalizes several previous results about the coupon collector's problem. Also, we provide applications about computational complexity and approximation.