Abstract

SUMMARY The problem of estimation in multipurpose sample surveys is treated from the prediction theory viewpoint. In particular, the interaction of sample imbalance and collinearity among the survey predictor variables is explored, and an estimation technique is developed which optimises the tradeoff between bias and variance for this situation. An application to a survey carried out by the Bureau of Agricultural Economics is given.

The survey statistician who uses a super-population model approach to estimation, either through preference or because of defective randomization of the sample, must always be conscious of the danger of a misspecified model. This danger is particularly acute with multipurpose surveys, for which means or totals of a large number of variables are to be estimated, and for which a full specification search for the model underlying each variable is not practicable. Royall and Herson (1973) have shown how balanced sampling can give protection from model misspecification. Balance can be achieved by randomization, by controlled selection, or by purposive sampling, but there are many circumstances where proper control of the sample selection process cannot be exercised, and an unbalanced sample may result.

Given an unbalanced sample and the need to use a model-based estimator without an exhaustive specification check, the statistician faces a dilemma. If relevant predictors are omitted from the model, then with an unbalanced sample substantial bias may result. On the other hand, if too many predictors are included, the model will be overspecified and inefficient survey estimates will result (see Rao 1971). In addition, these predictors can be related to one another, so there may be multicollinearity. In this case the estimated regression coefficients, though unbiased, will be unstable, and may be of the wrong sign.
There is also, of course, the question of the correct functional form, but if it is assumed that the functional form can be approximated by a polynomial, then this reduces to a question of which variables to include in a linear model. An estimation strategy for use in such unbalanced situations is outlined in the following section. It is interesting to note that this strategy leads naturally to the use of ridge-type estimators for the survey variables. Ridge estimators were originally proposed by Hoerl and Kennard (1970) in the context of parametric estimation for linear regression models. It is an indication of the flexibility inherent in the super-population approach to finite population estimation that these ideas can also be put to good use in survey sampling. The paper concludes with an application of the proposed strategy to estimation in an annual survey of Australian farms carried out by the Bureau of Agricultural Economics (BAE).
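
To make the ridge idea concrete, the following is a minimal sketch of a ridge-based predictor of a finite population total, assuming the standard Hoerl and Kennard form of the ridge coefficients. It is not the estimator developed in the paper. The function name `ridge_predict_total`, the penalty values, and the simulated population are all hypothetical, chosen only to illustrate how a penalty can stabilise prediction when the sample is unbalanced and two predictors are nearly collinear.

```python
import numpy as np

def ridge_predict_total(X_s, y_s, X_r, lam):
    """Predict a population total as the observed sample sum plus
    ridge-regression predictions for the non-sampled units.

    X_s, y_s: predictors and responses for sampled units.
    X_r: predictors for non-sampled units.
    lam: ridge penalty (lam = 0 gives ordinary least squares).
    For simplicity the penalty is applied to every coefficient,
    including the intercept; this is an assumption of the sketch.
    """
    p = X_s.shape[1]
    # Ridge coefficients: (X'X + lam * I)^{-1} X'y
    beta = np.linalg.solve(X_s.T @ X_s + lam * np.eye(p), X_s.T @ y_s)
    return y_s.sum() + (X_r @ beta).sum()

# Simulated population (illustrative only, not survey data).
rng = np.random.default_rng(0)
N, n = 500, 40
x1 = rng.gamma(2.0, 1.0, N)
x2 = x1 + rng.normal(0.0, 0.05, N)          # nearly collinear with x1
X = np.column_stack([np.ones(N), x1, x2])
y = 1.0 + 2.0 * x1 + rng.normal(0.0, 1.0, N)

# Deliberately unbalanced sample: the n units with the largest x1.
order = np.argsort(x1)
s, r = order[-n:], order[:-n]

print("true total:", round(y.sum(), 1))
for lam in (0.0, 1.0, 10.0):
    est = ridge_predict_total(X[s], y[s], X[r], lam)
    print("lam =", lam, "estimate:", round(est, 1))
```

With `lam = 0` the predictor reduces to the usual least squares prediction estimator; as `lam` grows the coefficients shrink toward zero, which reduces variance at the cost of bias. Choosing the penalty is therefore a bias and variance tradeoff of the kind the summary describes.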

