Bootstrap by sequential resampling

C. Radhakrishna Rao, P.K. Pathak, V.I. Koltchinskii

doi:10.1016/s0378-3758(97)00041-4

Abstract

This paper examines resampling for the bootstrap from a survey sampling point of view. Given an observed sample of size $n$, resampling for the bootstrap involves $n$ repeated trials of simple random sampling with replacement. From the point of view of information content, it is well known that simple random sampling with replacement does not result in samples that are equally informative (see Pathak (1964) Ann. Math. Statist. 35, 795–808; Biometrika 51, 185–193). This is due to the randomness in the number of distinct observations that occur in different bootstrap samples. We propose an alternative scheme of sampling sequentially (with replacement each time) until $k$ distinct original observations appear. In such a scheme, the bootstrap sample size becomes random as it varies from sample to sample, but each sample has exactly the same number of distinct observations. We show that the choice of $k = (1 - e^{-1})n \approx 0.632n$ has some advantage, stemming from the observation made by Efron (1983, J. Am. Statist. Assoc. 78, 316–331) that the usual bootstrap samples are supported on approximately $0.632n$ of the original data points. Using recent results on empirical processes, we show that the main empirical characteristics of the sequential resampling bootstrap are asymptotically within a distance of order $n^{-3/4}$ of the corresponding characteristics of the usual bootstrap.
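The sequential scheme described in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `sequential_bootstrap_indices` is hypothetical, and rounding $(1 - e^{-1})n$ to the nearest integer is a choice made here, since the abstract does not say how a non-integer $k$ is handled.

```python
import math
import random


def sequential_bootstrap_indices(n, k=None, rng=None):
    """Draw indices 0..n-1 with replacement until k distinct ones appear.

    Returns the full list of drawn indices (repeats included), so its
    length is random while the number of distinct indices is exactly k.
    """
    if rng is None:
        rng = random.Random()
    if k is None:
        # Assumption: round (1 - 1/e) * n to the nearest integer, at least 1.
        k = max(1, round((1 - math.exp(-1)) * n))
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")

    drawn = []      # every draw, in order
    seen = set()    # distinct indices observed so far
    while len(seen) < k:
        i = rng.randrange(n)  # simple random sampling with replacement
        drawn.append(i)
        seen.add(i)
    return drawn


if __name__ == "__main__":
    rng = random.Random(0)
    sample = sequential_bootstrap_indices(100, rng=rng)
    print(len(sample), len(set(sample)))  # random size, fixed distinct count
```

Sampling stops at the first draw that brings the distinct count to $k$, so the final index in the returned list always appears exactly once. By the standard coupon-collector calculation, the expected number of draws is $\sum_{j=0}^{k-1} n/(n-j) \approx n \ln\frac{n}{n-k}$, which is roughly $n$ when $k \approx (1 - e^{-1})n$, so the resample size should on average be close to that of the usual bootstrap.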
