Pseudonymization for research data collection: is the juice worth the squeeze?

Florian Kohlmayer, Ronald Lautenschläger, Fabian Prasser

doi:10.1186/s12911-019-0905-x

Abstract

Background: The collection of data and biospecimens which characterize patients and probands in depth is a core element of modern biomedical research. Relevant data must be considered highly sensitive and needs to be protected from unauthorized use and re-identification. In this context, laws, regulations, guidelines and best practices often recommend or mandate pseudonymization, which means that directly identifying data of subjects (e.g. names and addresses) is stored separately from data which is primarily needed for scientific analyses.

Discussion: When (authorized) re-identification of subjects is not an exceptional but a common procedure, e.g. due to longitudinal data collection, implementing pseudonymization can significantly increase the complexity of software solutions. For example, data stored in distributed databases needs to be dynamically combined, which requires additional interfaces for communication between the various subsystems. This increased complexity may lead to new attack vectors for intruders, which runs counter to the objective of improving data protection. What is lacking is a standardized process for evaluating and reporting risks, threats and countermeasures, which can be used to test whether integrating pseudonymization methods into data collection systems actually improves upon the degree of protection provided by system designs that simply follow common IT security best practices and implement fine-grained role-based access control models. To demonstrate that the methods used to describe systems employing pseudonymized data management are currently heterogeneous and ad hoc, we examined the extent to which twelve recent studies address each of the six basic security properties defined by the International Organization for Standardization (ISO) standard 27000. We show inconsistencies across the studies, with most of them failing to mention one or more security properties.

Conclusion: We discuss the degree of privacy protection provided by implementing pseudonymization into research data collection processes. We conclude that (1) more research is needed on the interplay of pseudonymity, information security and data protection, (2) problem-specific guidelines for evaluating and reporting risks, threats and countermeasures should be developed and that (3) future work on pseudonymized research data collection should include the results of such structured and integrated analyses.

Highlights

The collection of data and biospecimens which characterize patients and probands in-depth is a core element of modern biomedical research
We conclude that (1) more research is needed on the interplay of pseudonymity, information security and data protection, (2) problem-specific guidelines for evaluating and reporting risks, threats and countermeasures should be developed and that (3) future work on pseudonymized research data collection should include the results of such structured and integrated analyses
The collection of fine-grained personal health data has become an important element of biomedical research, which is required to obtain characterizations of patients and probands in necessary breadth and depth

Summary

Discussion

When (authorized) re-identification of subjects is not an exceptional but a common procedure, e.g. due to longitudinal data collection, implementing pseudonymization can significantly increase the complexity of software solutions. Data stored in distributed databases needs to be dynamically combined, which requires additional interfaces for communication between the various subsystems. This increased complexity may lead to new attack vectors for intruders, which runs counter to the objective of improving data protection. What is lacking is a standardized process for evaluating and reporting risks, threats and countermeasures, which can be used to test whether integrating pseudonymization methods into data collection systems improves upon the degree of protection provided by system designs that follow common IT security best practices and implement fine-grained role-based access control models. Across the twelve recent studies we examined, we show inconsistencies, with most of them failing to mention one or more of the six basic security properties defined by ISO standard 27000.
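The separation described above can be illustrated with a minimal sketch. It is not the design of any system reviewed in the paper: the class names `IdentityStore` and `ResearchStore`, the role name `study_nurse`, and the sample subject are all hypothetical, and the single role check merely stands in for a real access control model. The sketch shows that identifying data and research data live in separate stores linked only by a random pseudonym, and that longitudinal follow-up needs an extra interface (`resolve`) to combine them.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class IdentityStore:
    """Hypothetical store for directly identifying data, keyed by pseudonym."""
    _by_pseudonym: dict = field(default_factory=dict)

    def register(self, name: str, address: str) -> str:
        # The pseudonym is random, so it cannot be derived from the identity.
        pseudonym = secrets.token_hex(8)
        self._by_pseudonym[pseudonym] = {"name": name, "address": address}
        return pseudonym

    def resolve(self, pseudonym: str, role: str) -> dict:
        # Authorized re-identification: this is the additional interface
        # that pseudonymization introduces (assumed role name).
        if role != "study_nurse":
            raise PermissionError("role may not re-identify subjects")
        return self._by_pseudonym[pseudonym]


@dataclass
class ResearchStore:
    """Hypothetical store for research data; it never sees names or addresses."""
    _records: dict = field(default_factory=dict)

    def add_observation(self, pseudonym: str, visit: int, value: float) -> None:
        self._records.setdefault(pseudonym, []).append(
            {"visit": visit, "value": value}
        )

    def observations(self, pseudonym: str) -> list:
        return list(self._records.get(pseudonym, []))


identities = IdentityStore()
research = ResearchStore()

# Enrolment: identifying data goes to one store, measurements to the other.
pid = identities.register("Jane Doe", "1 Example Street")
research.add_observation(pid, visit=1, value=5.4)
research.add_observation(pid, visit=2, value=5.9)

# Longitudinal follow-up: both stores must be combined at run time.
contact = identities.resolve(pid, role="study_nurse")
history = research.observations(pid)
```

Every call to `resolve` crosses the boundary between the two stores, so in a setting where re-identification is routine this interface is exercised constantly and becomes part of the attack surface that a risk analysis would have to cover.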

Conclusion

Background

Literature review

Results

Accountability: “Responsibility of an entity for its actions and decisions”

Journal: BMC Medical Informatics and Decision Making
Publication Date: Sep 4, 2019
Citations: 9
License type: open-access
