DIY Google Trends indicators in social sciences: A methodological note

Ivana Lolić, Marina Matošec, Petar Sorić

doi:10.1016/j.techsoc.2024.102477

Abstract

Various branches of the social sciences often require high-frequency measures of collective behavior. Examining people's Google search patterns makes it possible to uncover otherwise hard-to-measure social trends and to construct leading indicators. Google Trends (GT) data provide a unique and rich dataset that enables a deeper look at the popularity of people's search interests. This paper delves into the main empirical approaches to constructing GT indicators, conducts a thorough methodological review, and elucidates the nuances and potential of GT data. Specifically, the paper furnishes a step-by-step methodological guide for constructing GT indicators: from data pre-treatment to composite estimation using an array of machine learning techniques (principal component analysis, XGBoost, and dynamic factor models). We scrutinize the prevalent construction strategies of GT indicators, compare keyword-based vs. category-based estimations and single vs. composite indicators, and evaluate the significance of various data transformations. Illustrating the utility of our methodological framework through a case study, we create GT indicators for US retail trade and examine their forecasting accuracy relative to a benchmark autoregressive model. Our findings reveal that XGBoost yields the best test-sample fit among composite indicators, while category-specific indicators like Shopping and Retail Trade exhibit worse predictive accuracy. As an additional contribution, we provide detailed and user-friendly R code for replicating our analysis.
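
To make the idea of a composite indicator concrete, the following is a minimal sketch of one of the techniques named above, principal component analysis, applied to several search-volume series. It is not the authors' implementation (their replication code is in R); it is written in Python, the data are synthetic, and the names `pca_composite`, `searches`, and `latent` are hypothetical and chosen only for illustration.

```python
import numpy as np


def pca_composite(series: np.ndarray) -> np.ndarray:
    """Return the first principal component of column-standardized series.

    `series` has one row per period and one column per search term.
    """
    # Standardize each column so that no single term dominates by scale.
    z = (series - series.mean(axis=0)) / series.std(axis=0, ddof=1)
    # SVD of the standardized data: rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    loadings = vt[0]
    # The sign of a principal component is arbitrary; orient it so that
    # the loadings sum to a positive value.
    if loadings.sum() < 0:
        loadings = -loadings
    return z @ loadings


# Synthetic data (an assumption, not real GT data): five keyword series
# driven by one common random-walk factor plus idiosyncratic noise.
rng = np.random.default_rng(0)
n_months, n_keywords = 120, 5
latent = rng.standard_normal(n_months).cumsum()
noise = rng.standard_normal((n_months, n_keywords))
searches = latent[:, None] * np.array([1.0, 0.8, 0.6, 0.9, 0.7]) + noise

composite = pca_composite(searches)
corr = np.corrcoef(composite, latent)[0, 1]
print(f"Correlation between composite and common factor: {corr:.2f}")
```

In an applied setting, the columns of `searches` would be replaced by pre-treated GT series, and the resulting composite would then be evaluated against a benchmark such as an autoregressive model, as the abstract describes.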

Full Text