Abstract

There are abundant practical examples where a sampling unit is associated with several variables, one of which is expensive to measure while the others can be obtained easily and cheaply. The “expensive” variable is either the variable of main interest or the response variable in the setting of a regression model. A few such examples were given in Chapter 1. For convenience, we refer to the “expensive” variable as the response variable in all cases and to the others as concomitant variables. In this chapter, we are concerned with the estimation of certain characteristics of the response variable and the estimation of the regression model. We deal with how concomitant variables can be used in ranked set sampling (RSS) and the related regression analysis. The use of a single concomitant variable in RSS was briefly mentioned in Chapter 2, where it was shown that the ranking mechanism using a single concomitant variable is consistent. In the current chapter, we concentrate on consistent ranking mechanisms using multiple concomitant variables. Two such ranking mechanisms are developed: a multi-layer ranking mechanism and an adaptive ranking mechanism. RSS with the multi-layer ranking mechanism, referred to as the multi-layer RSS, is discussed in Section 6.1. The multi-layer ranking mechanism is conceptually equivalent to a stratification of the space of the concomitant variables. The multi-layer RSS is particularly useful for the estimation of regression coefficients, and its features are investigated through simulation studies. RSS with the adaptive ranking mechanism, referred to as the adaptive RSS, is discussed in Section 6.2. In the adaptive RSS, the conditional expectation of the response variable given the concomitant variables is used as the ranking criterion in an adaptive way.
The conditional expectation is continually estimated and updated using the data already obtained and is then used as the ranking criterion for further sampling. The adaptive ranking mechanism is the best, at least asymptotically, if the major concern is the estimation of certain characteristics of the response variable, such as its mean and quantiles. In Section 6.3, the estimation of the regression model and regression estimates of the mean of the response variable in the context of RSS are discussed. It is argued that, for the estimation of the mean of the response variable, the RSS regression estimate is better than the RSS sample mean as long as the response variable and the concomitant variables are moderately correlated. It is shown that, for the estimation of the regression coefficients, balanced RSS and simple random sampling (SRS) are asymptotically equivalent in terms of efficiency. More efficient estimates of the regression coefficients require unbalanced RSS. Section 6.4 takes up the design of unbalanced RSS for regression analysis under the criteria of A-optimality, D-optimality, and IMSE-optimality. A general method is discussed, and details are provided for polynomial regression models. Some technical details are given in Section 6.5.
