Clarifications and New Insights on Conditional Bias

Gilles Bourgault

doi:10.1007/s11004-020-09853-6

Abstract

This study revisits the conditional bias that can be observed with spatial estimators such as kriging. In the geostatistical literature, the term “conditional bias” has been used to describe two different effects: underestimation of high values and overestimation of low values, or the opposite, viz. overestimation of high values and underestimation of low values. To add to the confusion, the smoothing effect of the estimator is always indicated to be the culprit. It seems that geostatisticians have been debating conditional bias since the birth of geostatistics. Is less or more smoothing required to alleviate conditional bias, and which one? This paradox is actually resolved when one considers the different distribution partitions on which conditional expectation can be calculated. Depending on the partitions of the bivariate distribution of true versus estimated values, conditional expectation can be calculated on conditional or marginal distributions. These lead to different types of conditional bias, and smoothing affects them differently. The type based on conditional distributions is smoothing friendly, while the type based on marginal distributions is smoothing adverse. The same estimator can display under- and overestimation, depending on whether a conditional or marginal distribution is considered. It is also observed that all conditional biases, regardless of the bivariate distribution partitions, are greatly affected by the variance of the conditioning data and vary with the sampling. A simple estimator correction can be applied to exactly remove the smoothing-friendly conditional bias in the sample as measured by the slope of the linear regression between the true and estimated values in cross-validation. Over many samplings, it is observed that this cross-validation measure is itself conditionally biased, depending on the variance of the data. On the other hand, the smoothing-adverse type of conditional bias can be corrected by conditional simulation that reproduces the distribution of the data. The results are also biased, depending on the variance of the conditioning data. Correcting for the smoothing-adverse type will worsen the smoothing-friendly type, and vice versa. Both types of conditional bias can be corrected by averaging statistics, or averaging estimates, over multiple samplings.
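The slope-based measure mentioned in the abstract can be illustrated with a short sketch. The abstract does not give the form of the correction, so the linear rescaling below is an assumption chosen because it makes the in-sample slope equal to one by construction; it is not necessarily the author's estimator. The data are synthetic stand-ins for cross-validation output, and the names `regression_slope` and `rescale_to_unit_slope` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for cross-validation results (assumed, not from the paper).
n = 500
true_vals = rng.normal(10.0, 2.0, n)
# An estimator that is deliberately too variable, so the slope of
# true-on-estimated is well below 1.
est_vals = 10.0 + 1.5 * (true_vals - 10.0) + rng.normal(0.0, 2.0, n)


def regression_slope(true_v, est_v):
    """Slope of the linear regression of true values on estimated values."""
    c = np.cov(est_v, true_v, ddof=1)
    return c[0, 1] / c[0, 0]


def rescale_to_unit_slope(true_v, est_v):
    """Assumed correction: shrink or stretch estimates about their mean.

    With b the slope above and m the mean estimate, m + b * (est - m)
    has a true-on-estimated regression slope of exactly 1 in this sample.
    """
    b = regression_slope(true_v, est_v)
    m = est_v.mean()
    return m + b * (est_v - m)


slope_before = regression_slope(true_vals, est_vals)
corrected = rescale_to_unit_slope(true_vals, est_vals)
slope_after = regression_slope(true_vals, corrected)

print(f"slope before: {slope_before:.3f}")
print(f"slope after:  {slope_after:.3f}")
print(f"variance of estimates before/after: "
      f"{est_vals.var(ddof=1):.2f} / {corrected.var(ddof=1):.2f}")
```

In this synthetic case the slope before correction is below one, so the rescaling reduces the variance of the estimates, that is, it adds smoothing. The slope equals one only for the sample used to compute it, which is in line with the abstract's remark that the cross-validation measure itself varies with the sampling.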
