This article evaluates the quality of data collection in individual-level desktop web tracking used in the social sciences and shows that existing approaches face three sets of problems: sampling issues; validity issues arising from the lack of content-level data and from their disregard for the variety of devices and long-tail consumption patterns; and transparency and privacy issues. To overcome some of these problems, the article introduces a new academic web tracking solution, WebTrack, an open-source tracking tool maintained by GESIS, a major European research institution. The design logic, the interfaces, and the backend requirements for WebTrack are discussed, followed by a detailed examination of the strengths and weaknesses of the tool. Finally, using data from 1,185 participants, the article empirically illustrates how improved data collection through WebTrack leads to innovative shifts in the use of tracking data. Because WebTrack collects the content people are exposed to beyond the classical news platforms, it can greatly improve the detection of politics-related information consumption in tracking data through automated content analysis, compared to traditional approaches that rely on source-level analysis.