Abstract

Many branches of the social sciences require high-frequency measures of collective behavior. Examining people's Google search patterns makes it possible to uncover otherwise hard-to-measure social trends and to construct leading indicators. Google Trends (GT) data offer a unique and rich source of information on the popularity of people's search interests. This paper reviews the main empirical approaches to constructing GT indicators, conducts a thorough methodological review, and clarifies the nuances and potential of GT data. Specifically, the paper provides a step-by-step methodological guide for constructing GT indicators: from data pre-treatment to composite estimation using an array of machine learning techniques (principal component analysis, XGBoost, and dynamic factor models). We examine the prevalent construction strategies for GT indicators, compare keyword-based vs. category-based estimations and single vs. composite indicators, and evaluate the significance of various data transformations. To illustrate the utility of our methodological framework, we present a case study in which we create GT indicators for US retail trade and compare their forecasting accuracy with that of a benchmark autoregressive model. Our findings reveal that XGBoost yields the best test-sample fit among composite indicators, while category-specific indicators such as Shopping and Retail Trade exhibit worse predictive accuracy. As an additional contribution, we provide detailed, user-friendly R code for replicating our analysis.

