Abstract
Two-phase sampling is often used for estimating a population total or mean when the cost per unit of collecting auxiliary variables, x, is much smaller than the cost per unit of measuring a characteristic of interest, y. In the first phase, a large sample s1 is drawn according to a specified sampling design p(s1), and auxiliary data x are observed for the units i∈s1. Given the first-phase sample s1, a second-phase sample s2 is selected from s1 according to a specified sampling design p(s2∣s1), and (y, x) is observed for the units i∈s2. In some cases, the population totals of some components of x may also be known. Two-phase sampling is used for stratification at the second phase or at both phases, and for regression estimation. Horvitz–Thompson-type variance estimators are commonly used for variance estimation. However, the Horvitz–Thompson (Horvitz & Thompson, J. Amer. Statist. Assoc. 1952) variance estimator in uni-phase sampling is known to be highly unstable and may take negative values when the units are selected with unequal probabilities. On the other hand, the Sen–Yates–Grundy variance estimator is relatively stable and non-negative for several unequal probability sampling designs with fixed sample sizes. In this paper, we extend the Sen–Yates–Grundy (Sen, J. Ind. Soc. Agric. Statist. 1953; Yates & Grundy, J. Roy. Statist. Soc. Ser. B 1953) variance estimator to two-phase sampling, assuming a fixed first-phase sample size and a fixed second-phase sample size given the first-phase sample. We apply the new variance estimators to two-phase sampling designs with stratification at the second phase or at both phases. We also develop Sen–Yates–Grundy-type variance estimators of the two-phase regression estimators that make use of the first-phase auxiliary data and known population totals of some of the auxiliary variables.
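For orientation, the two uni-phase variance estimators contrasted in the abstract can be sketched as follows. This is only an illustration of the standard single-phase formulas, not the two-phase extension developed in the paper. The function names `ht_variance` and `syg_variance`, the argument layout (`pi` for first-order and `pi2` for joint inclusion probabilities), and the toy sample are assumptions made for this example.

```python
from itertools import combinations


def ht_variance(y, pi, pi2):
    """Horvitz-Thompson variance estimator of the estimated total sum(y_i / pi_i).

    y   : sampled values
    pi  : first-order inclusion probabilities of the sampled units
    pi2 : matrix of joint inclusion probabilities of the sampled units
    """
    n = len(y)
    # Diagonal terms: (1 - pi_i) / pi_i^2 * y_i^2
    v = sum((1.0 - pi[i]) / pi[i] ** 2 * y[i] ** 2 for i in range(n))
    # Off-diagonal terms; each unordered pair appears twice in the double sum
    for i, j in combinations(range(n), 2):
        weight = (pi2[i][j] - pi[i] * pi[j]) / pi2[i][j]
        v += 2.0 * weight * (y[i] / pi[i]) * (y[j] / pi[j])
    return v


def syg_variance(y, pi, pi2):
    """Sen-Yates-Grundy variance estimator (fixed sample size designs only)."""
    v = 0.0
    # One term per unordered pair; non-negative whenever pi_ij <= pi_i * pi_j
    for i, j in combinations(range(len(y)), 2):
        weight = (pi[i] * pi[j] - pi2[i][j]) / pi2[i][j]
        v += weight * (y[i] / pi[i] - y[j] / pi[j]) ** 2
    return v


# Toy check (assumed data): simple random sampling without replacement,
# n = 2 units drawn from N = 4, so pi_i = 1/2 and pi_ij = 1/6.
y = [1.0, 3.0]
pi = [0.5, 0.5]
pi2 = [[0.5, 1.0 / 6.0], [1.0 / 6.0, 0.5]]

v_ht = ht_variance(y, pi, pi2)
v_syg = syg_variance(y, pi, pi2)
print(v_ht, v_syg)
```

Under simple random sampling the two estimators coincide (both equal 8 here, up to floating-point rounding). They generally differ under unequal probability designs, which is the setting the abstract is concerned with.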