Abstract
Linked survey and Twitter data present an unprecedented opportunity for social scientific analysis, but the ethical implications for such work are complex—requiring a deeper understanding of the nature and composition of Twitter data to fully appreciate the risks of disclosure and harm to participants. In this article, we draw on our experience of three recent linked data studies, briefly discussing the background research on data linkage and the complications around ensuring informed consent. Particular attention is paid to the vast array of data available from Twitter and in what manner it might be disclosive. In light of this, the issues of maintaining security, minimizing risk, archiving, and reuse are applied to linked Twitter and survey data. In conclusion, we reflect on how our ability to collect and work with Twitter data has outpaced our technical understandings of how the data are constituted and observe that understanding one’s data is an essential prerequisite for ensuring best ethical practice.
Highlights
The advent of social media has provided researchers with a potentially rich source of information regarding the behaviors, attitudes, and beliefs of individuals (Sloan, Morgan, Burnap, & Williams, 2015), but with it has come the substantial but necessary task of reconceptualizing some of the standard practices of ethical social research: voluntary participation, minimizing harm, and maximizing value.
What is to be done when Twitter data are linked with survey data where the data are not public and we would otherwise aim for anonymity?
Given the complexity of linking survey and Twitter data, the resulting difficulty in achieving truly “informed” consent, and the inability to consistently rely on anonymization of data to protect participants from risk of harm, increased emphasis should be placed on maintaining the security of data throughout the research process.
Summary
The advent of social media has provided researchers with a potentially rich source of information regarding the behaviors, attitudes, and beliefs of individuals (Sloan, Morgan, Burnap, & Williams, 2015), but with it has come the substantial but necessary task of reconceptualizing some of the standard practices of ethical social research: voluntary participation (informed consent), minimizing harm (disclosure control and security), and maximizing value (archiving). The constant rerecording of the same attributes does open up new avenues for research as we can see how perhaps initially unimportant things change over time, such as how users describe themselves or how their follower and followee numbers change in light of their behavior. This comprehensive repeated snapshot is precisely why Twitter data can add value to survey studies through data linkage, but it presents a substantial problem in that many researchers may not be aware of the extent of information they are collecting about an individual and what attributes could compromise respondent anonymity. It is based on processes developed for securely linking survey and administrative data (Administrative Data Research Network, 2018), reflecting the principle of “systematic processing.” Initially, the data collected from the survey will include a unique ID, the survey data, and the