Abstract
Classical statistical models assume that observations come from samples drawn by simple random sampling with replacement (srswr) or, equivalently, that the observations are independently and identically distributed (IID). The conventional formulae in standard statistical packages that implement these procedures therefore rest on the same IID assumptions. In practice, large-scale surveys generally select samples using a complex sampling design, such as stratified multistage sampling, which departs from an IID setup. Moreover, in large-scale sample surveys the finite population is often regarded as a sample from a superpopulation, and survey data are commonly used for analytic inference about model parameters such as means, regression coefficients, and cell probabilities. The sampling design may mean that the sample observations no longer follow the same superpopulation model as the complete finite population. Thus, even if the IID assumption holds for the complete population, it generally breaks down for the sample observations. The inadequacy of the IID assumption is well known in the sample survey literature. It has long been known, for example, that the homogeneity which population clusters generally exhibit tends to increase the variance of the sample estimator over that of the estimator under the srswr assumption, and that estimates of this variance wrongly based on IID assumptions are generally biased downwards. It is therefore necessary to examine the effect of the true complex design on the variance of an estimator relative to an srswr design or an IID model setup. Section 2.2 examines these effects, namely the design effect and the misspecification effect of a complex design, for estimation of a single parameter \(\theta \).
The effect of a complex design on the confidence interval of \(\theta \) is considered in the next section. Section 2.4 extends the concepts of Sect. 2.2 to the multiparameter case and thus defines the multivariate design effect. Since estimating the variance of the estimator \(\hat{\theta }\) of \(\theta \) (its covariance matrix when \(\theta \) is a vector of parameters) is of major interest in this chapter, the subsequent section considers different methods of variance estimation, particularly for nonlinear estimators. These estimation procedures are very general: they do not depend on any distributional assumption and are therefore nonparametric in nature. Section 2.5.1 considers in detail a simple method for estimating the variance of a linear statistic. Sections 2.5.2–2.5.7 consider the Taylor series linearization procedure, the random group (RG) method, balanced repeated replication (BRR), the jackknife (JK) procedure, JK repeated replication, and bootstrap (BS) techniques of variance estimation. Lastly, we consider the effect of a complex survey design on a classical test statistic for testing a hypothesis regarding a covariance matrix.
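As a rough illustration of the comparison described above, the following minimal sketch contrasts a delete-one-cluster jackknife variance estimate of a sample mean with the variance estimate obtained by treating all observations as IID. It is not taken from the chapter. It assumes a single-stage cluster sample whose clusters are selected with equal probability and with replacement, and the function names (`jackknife_cluster_variance`, `iid_variance`, `estimated_design_effect`) and the toy data are hypothetical. The ratio of the two estimates is used here as a simple estimate of the design effect.

```python
from statistics import variance


def jackknife_cluster_variance(clusters):
    """Delete-one-cluster jackknife variance estimate of the overall sample mean.

    Assumption (for illustration only): clusters are the primary sampling
    units, drawn with equal probability and with replacement.
    """
    n = len(clusters)
    if n < 2:
        raise ValueError("at least two clusters are needed")
    total = sum(sum(c) for c in clusters)
    count = sum(len(c) for c in clusters)
    # Replicate estimates: the sample mean recomputed with one cluster removed.
    replicates = [(total - sum(c)) / (count - len(c)) for c in clusters]
    rep_mean = sum(replicates) / n
    return (n - 1) / n * sum((r - rep_mean) ** 2 for r in replicates)


def iid_variance(clusters):
    """Variance estimate of the mean that ignores clustering (IID formula s^2 / m)."""
    values = [y for c in clusters for y in c]
    return variance(values) / len(values)


def estimated_design_effect(clusters):
    """Ratio of the cluster-aware variance estimate to the IID-based one."""
    return jackknife_cluster_variance(clusters) / iid_variance(clusters)


# Hypothetical toy data: four perfectly homogeneous clusters of three units each.
homogeneous = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
print(round(estimated_design_effect(homogeneous), 3))
```

With these toy data the ratio is about 3.67, so the IID formula understates the variance by a wide margin. This is the downward bias that within-cluster homogeneity is said to produce.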