Abstract

Background: Human immunodeficiency virus (HIV) is a serious health problem in the Russian Federation. However, the true scale of HIV in Russia has long been the subject of considerable debate. Using digital surveillance to monitor diseases has become increasingly popular in high-income countries. But Internet users may not be representative of overall populations, and the characteristics of the Internet-using population cannot be directly ascertained from search pattern data. This exploratory infoveillance study examined whether Internet search patterns can be used for disease surveillance in a large middle-income country with a dispersed population.

Objective: This study had two main objectives: (1) to validate Internet search patterns against national HIV prevalence data, and (2) to investigate the relationship between search patterns and the determinants of Internet access.

Methods: We first assessed whether online surveillance is a valid and reliable method for monitoring HIV in the Russian Federation. Yandex and Google both provided tools to study search patterns in the Russian Federation. We evaluated the relationship between aggregated search patterns from both Yandex and Google and HIV prevalence in 2011 at the national and regional levels. Second, we analyzed the determinants of Internet access to determine the extent to which they explained regional variations in searches for the Russian terms for “HIV” and “AIDS”. We sought to extend understanding of the characteristics of Internet searching populations by matching data on the determinants of Internet access (age, education, income, broadband access price, and urbanization ratios) with searches for the term “HIV” using principal component analysis (PCA).

Results: We found generally strong correlations between HIV prevalence and searches for the terms “HIV” and “AIDS”. Nationally, Yandex searches for “HIV” were very strongly correlated with HIV prevalence (Spearman rank-order coefficient [rs]=.881, P≤.001), and searches for “AIDS” were strongly correlated (rs=.714, P≤.001). The strength of correlations varied across Russian regions. National correlations for Google searches for “HIV” (rs=.672, P=.004) and “AIDS” (rs=.584, P≤.001) were weaker than for Yandex. Second, we examined the relationship between the determinants of Internet access and search patterns for the term “HIV” across Russia using PCA. At the national level, Principal Component 1 accounted for 32% of the variance, with loadings including age (-0.56), HIV search (-0.533), and education (-0.479). Principal Component 2 accounted for 22% of national variance (income, -0.652; broadband price, -0.460).

Conclusions: This study contributes to the methodological literature on search patterns in public health. Based on our preliminary research, we suggest that PCA may be used to evaluate the relationship between the determinants of Internet access and searches for health problems beyond high-income countries. We believe it is in middle-income countries that search methods can make the greatest contribution to public health.
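
The two statistics reported above, the Spearman rank-order coefficient and the share of variance captured by a first principal component, can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not state which software was used, the function names (`average_ranks`, `pearson`, `spearman`, `first_component`) are hypothetical, the sample values are invented purely for illustration, and power iteration on the correlation matrix stands in here for a full PCA routine.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def first_component(rows, iterations=500):
    """Return (loadings, variance share) of the first principal component.

    Uses the correlation matrix, which is equivalent to standardizing each
    variable first, and finds its leading eigenvector by power iteration.
    """
    cols = list(zip(*rows))
    p = len(cols)
    corr = [[pearson(cols[a], cols[b]) for b in range(p)] for a in range(p)]
    v = [1.0 / sqrt(p)] * p  # assumed start; must not be orthogonal to PC1
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    eigenvalue = sum(
        v[a] * sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)
    )
    # The trace of a correlation matrix equals p, so this is the variance share.
    return v, eigenvalue / p


# Invented illustrative values, NOT data from the study.
search_volume = [120, 340, 210, 560, 430, 90]
prevalence = [0.21, 0.55, 0.40, 0.95, 0.60, 0.15]
print("Spearman rs:", round(spearman(search_volume, prevalence), 3))

# Hypothetical rows: (median age, share with higher education, HIV search index).
regions = [(38, 0.22, 120), (41, 0.30, 340), (36, 0.25, 210),
           (43, 0.35, 560), (40, 0.28, 430), (35, 0.20, 90)]
loadings, share = first_component(regions)
print("PC1 loadings:", [round(c, 3) for c in loadings])
print("PC1 variance share:", round(share, 3))
```

The sign of a principal component is arbitrary, so a set of loadings that are all negative, as reported above, describes the same component as the same magnitudes with all signs flipped. In practice a library routine for rank correlation and PCA would normally replace these hand-written helpers.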
