Abstract

Purchase data from retail chains can provide proxy measures of private household expenditure on items that are the most troublesome to collect in the traditional expenditure survey. Due to the inevitable coverage and selection errors, bias must exist in these proxy measures. Moreover, given the sheer amount of data, the bias completely dominates the variance. To investigate the potential of replacing costly and burdensome surveys by non-survey big-data sources, we propose an audit sampling inference approach, which does not require linking the audit sample and the big-data source at the individual level. It turns out that one is unable to reject a null hypothesis of unbiased big-data estimation at the chosen size, because the audit sampling variance is too large compared to the bias of the big-data estimate. For the same reason, audit sampling fails to yield a meaningful mean squared error estimate. We propose a novel accuracy measure that is generally applicable in such situations. This can provide a necessary part of the statistical argument for the uptake of non-survey big-data sources, in replacement of traditional survey sampling. An application to disaggregated food price indices is used to demonstrate the proposed approach.
