Abstract

We document a new source of consumer price microdata. The new database gives researchers studying consumer price behaviour access to current, granular raw statistical observations. The range of observed prices fully covers the goods and services in Rosstat’s CPI sample and extends beyond it. In this paper, we pursue two objectives. First, we describe the data collection mechanism, the data structure, and the access protocol, and we provide four complete illustrations of their application using an open API: i) training machine learning models for product classification based on text labels, ii) real-time tracking of product prices, iii) estimating hedonic regressions for product groups, and iv) calculating arbitrary analytical price indices. Second, we share a set of basic skills and technologies for the benefit of researchers interested in creating their own sources of alternative data.

Highlights

  • We document a new source of consumer price microdata

  • The efforts to turn new data sources into sets of interpretable indicators are combined within the research area of alternative data viewed in the broadest sense of the term

  • The data are open to interested researchers through an application programming interface (API); we describe the access mode and protocol for researchers and potential participants in the price monitoring project


Summary

Introduction: alternative sources of price data

The researcher’s contribution has been the analysis. The data inputs are usually provided by national statistical institutes, international organisations, or dedicated statistical data providers, such as financial market trading platforms. The consumer price data source that we have created and documented in this paper relies on web scraping and meets both the broad and narrow definitions of alternative data. These data are supplementary to official statistics. March 2021 text labels, b) real-time price level tracking for individual goods or categories of goods, such as during periods of regulatory policy change or significant supply shocks, c) design of a prototype hedonic regression for a class of goods with a short technological cycle, d) calculation of arbitrary price indices, such as the food price index.
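To make the last of these applications concrete, the sketch below computes an elementary price index from price microdata as the geometric mean of price relatives (the Jevons formula) over products observed in both periods. It is only an illustration under stated assumptions: the observations are synthetic, the record layout and the function name `jevons_index` are hypothetical, and the Jevons formula is our choice for the example rather than the index method used in the paper.

```python
from collections import defaultdict
from math import exp, log

# Synthetic observations (product_id, period, price).
# Illustrative values only; they are not taken from the documented dataset.
observations = [
    ("milk_1l", "2021-02", 60.0),
    ("milk_1l", "2021-03", 66.0),
    ("bread_400g", "2021-02", 40.0),
    ("bread_400g", "2021-03", 40.0),
    ("eggs_10", "2021-02", 80.0),
    ("eggs_10", "2021-03", 88.0),
    ("butter_180g", "2021-03", 150.0),  # no base-period price, so it is skipped
]


def jevons_index(obs, base_period, current_period):
    """Return the Jevons index (base period = 100) over matched products."""
    # Group prices by product so that each product maps period -> price.
    prices = defaultdict(dict)
    for product_id, period, price in obs:
        prices[product_id][period] = price

    # Keep only products priced in both periods and take log price relatives.
    log_relatives = [
        log(by_period[current_period] / by_period[base_period])
        for by_period in prices.values()
        if base_period in by_period and current_period in by_period
    ]
    if not log_relatives:
        raise ValueError("no products observed in both periods")

    # The mean of log relatives, exponentiated, is the geometric mean.
    return 100.0 * exp(sum(log_relatives) / len(log_relatives))


index = jevons_index(observations, "2021-02", "2021-03")
print(f"Jevons index, 2021-03 relative to 2021-02: {index:.2f}")
```

Restricting the calculation to matched products keeps the index from reacting to changes in the set of scraped items, at the cost of ignoring newly appearing goods; a weighted or chained index would need expenditure weights and further assumptions that this sketch does not cover.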

Technologies for collecting price statistics
Comparing alternative price observation technologies
Web scraping technologies
Dataset description and access protocol
Dataset description
Data availability and principles of working with data
Machine learning and classification of goods
Real-time monitoring of prices
Proto-hedonic regression
Building price indices
Conclusion
