Abstract In this study, we proposed a new method for estimating the sensitivity of enterprises in Italy to the United Nation’s sustainable development goals at the provincial level using web-scraping data (a nonprobability sample) because this value is not surveyed by the Italian National Institute of Statistics. The proposed method used a probability sample to reduce the selection bias of estimates obtained from the nonprobability sample in the context of small area estimation and integrated nonprobability and probability samples using a double robust estimator that combined (i) propensity weighting to improve the representativeness of the nonprobability sample and (ii) a statistical model to predict the units that were not in the nonprobability sample. A bootstrap procedure for estimating variance was also proposed. To validate the proposed method, a Monte Carlo simulation was performed. Results showed that the proposed method allowed the correction of bias from the nonprobability sample while maintaining a good level of estimate reliability.
Read full abstract