Need of care in interpreting Google Trends-based COVID-19 infodemiological study results: potential risk of false-positivity

Kenichiro Sato, Tatsushi Toda, Tatsuo Mano, Atsushi Iwata

doi:10.1186/s12874-021-01338-2


Open Access

https://doi.org/10.1186/s12874-021-01338-2


Abstract

Background: Google Trends (GT) is being used as an epidemiological tool to study coronavirus disease (COVID-19) by identifying keywords in search trends that are predictive for the COVID-19 epidemiological burden. However, many of the earlier GT-based studies include potential statistical fallacies by measuring the correlation between non-stationary time sequences without adjusting for multiple comparisons or the confounding of media coverage, leading to concerns about the increased risk of obtaining false-positive results. In this study, we aimed to apply statistically more favorable methods to validate the earlier GT-based COVID-19 study results.

Methods: We extracted the relative GT search volume for keywords associated with COVID-19 symptoms, and evaluated their Granger-causality to weekly COVID-19 positivity in eight English-speaking countries and Japan. In addition, the impact of media coverage on keywords with significant Granger-causality was further evaluated using Japanese regional data.

Results: Our Granger causality-based approach largely decreased (by up to approximately one-third) the number of keywords identified as having a significant temporal relationship with the COVID-19 trend when compared to those identified by Pearson or Spearman’s rank correlation-based approach. “Sense of smell” and “loss of smell” were the most reliable GT keywords across all the evaluated countries; however, when adjusted with their media coverage, these keyword trends did not Granger-cause the COVID-19 positivity trends (in Japan).

Conclusions: Our results suggest that some of the search keywords reported as candidate predictive measures in earlier GT-based COVID-19 studies may potentially be unreliable; therefore, caution is necessary when interpreting published GT-based study results.

Highlights

Google Trends (GT) is being used as an epidemiological tool to study coronavirus disease (COVID-19) by identifying keywords in search trends that are predictive for the COVID-19 epidemiological burden
To address the analytical concerns about earlier studies, we use the vector autoregression (VAR) model [11,12,13], which is designed for time-series data and is robust against the weaknesses of correlation-based analysis, to identify statistically more reliable symptom keywords whose GT trends may serve as a predictive measure for future COVID-19 positivity trends, and to validate the earlier study results
COVID-19 data and Google Trends (GT) data were separately analyzed in nine different regions: Japan (JP) and eight English-speaking countries, namely, Australia (AU), Canada (CA), Great Britain (GB), Ireland (IE), India (IN), Singapore (SG), United States (US), and South Africa (ZA)
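
The Granger-causality idea behind the VAR approach named in the highlights can be illustrated with a minimal sketch. It is not the authors' pipeline: the single lag, the synthetic data, and every name (`simulate`, `granger_f`, `rss`, `solve`) are assumptions made for illustration. The sketch asks whether adding the lagged "cause" series to a regression of the "effect" series on its own lag reduces the residual error, and summarizes that reduction as an F statistic.

```python
import random

def solve(a, b):
    """Solve the small linear system a * coef = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest entry of this column onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def rss(rows, target):
    """Residual sum of squares of an ordinary least squares fit of target on rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    coef = solve(xtx, xty)
    return sum((t - sum(c * v for c, v in zip(coef, r))) ** 2
               for r, t in zip(rows, target))

def granger_f(cause, effect):
    """F statistic for 'cause Granger-causes effect' using one lag (an assumption)."""
    target = effect[1:]
    restricted = [[1.0, e] for e in effect[:-1]]              # own lag only
    full = [[1.0, e, c] for e, c in zip(effect[:-1], cause[:-1])]  # plus lagged cause
    rss_r, rss_f = rss(restricted, target), rss(full, target)
    dof = len(target) - 3  # observations minus parameters in the full model
    return (rss_r - rss_f) / (rss_f / dof)

def simulate(n=200, seed=1):
    """Synthetic stationary series in which x leads y by one step (illustrative only)."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [0.0]
    for t in range(1, n):
        y.append(0.5 * y[-1] + 0.8 * x[t - 1] + rng.gauss(0, 1))
    return x, y

if __name__ == "__main__":
    x, y = simulate()
    print("F (x -> y):", granger_f(x, y))
    print("F (y -> x):", granger_f(y, x))
```

In practice the statistic would be compared against an F distribution with 1 and `dof` degrees of freedom, the lag order would be chosen from the data, and the series would first be checked for stationarity.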

Summary

Introduction

Google Trends (GT) is being used as an epidemiological tool to study coronavirus disease (COVID-19) by identifying keywords in search trends that are predictive for the COVID-19 epidemiological burden. Many of the earlier GT-based studies include potential statistical fallacies by measuring the correlation between non-stationary time sequences without adjusting for multiple comparisons or the confounding of media coverage, leading to concerns about the increased risk of obtaining false-positive results. Pearson (or Spearman’s rank) correlation is often applied to assess the correlation between the time-series trends of COVID-19 cases/deaths and GT trends in symptom keywords without confirming the stationarity of these time series. This can be critically inappropriate in time-series analysis: time-series data often contain a unit root, and the correlation between such series often yields high coefficient values and t-statistics [14], which increases the likelihood of obtaining spurious correlations. In addition, because COVID-19 and its symptoms have attracted intensive attention worldwide, the influence of media coverage on GT symptom keywords is inevitable [10, 15, 16], yet it has hardly been adjusted for in a statistically favorable manner.
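
The spurious-correlation concern described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not an analysis from the paper: the series are synthetic random walks (which contain a unit root by construction), the 0.5 threshold and series length of 60 are arbitrary, and all function names are hypothetical. It counts how often two independent series show a large Pearson correlation, before and after first-differencing.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def random_walk(n, rng):
    """Cumulative sum of independent Gaussian steps: a unit-root series."""
    level, walk = 0.0, []
    for _ in range(n):
        level += rng.gauss(0, 1)
        walk.append(level)
    return walk

def diff(series):
    """First difference, which turns a random walk back into stationary noise."""
    return [b - a for a, b in zip(series, series[1:])]

def share_large(n_pairs=500, n_points=60, threshold=0.5, seed=0):
    """Share of independent pairs with |r| above threshold, raw vs. differenced."""
    rng = random.Random(seed)
    raw_hits = diff_hits = 0
    for _ in range(n_pairs):
        a = random_walk(n_points, rng)
        b = random_walk(n_points, rng)  # generated independently of a
        if abs(pearson(a, b)) > threshold:
            raw_hits += 1
        if abs(pearson(diff(a), diff(b))) > threshold:
            diff_hits += 1
    return raw_hits / n_pairs, diff_hits / n_pairs

if __name__ == "__main__":
    raw_share, diff_share = share_large()
    print("share with |r| > 0.5, raw random walks:", raw_share)
    print("share with |r| > 0.5, after differencing:", diff_share)
```

Because the two walks in each pair are unrelated, any large correlation between them is spurious. Comparing the two shares shows how much of the apparent association disappears once the series are made stationary.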



Journal: BMC Medical Research Methodology	Publication Date: Jul 18, 2021
Citations: 29	License type: open-access


