Abstract
In recent years, as newer technologies have evolved around the healthcare ecosystem, more and more data have been generated. Advanced analytics could harness the data collected from numerous sources, whether held by healthcare institutions or generated by individuals themselves via apps and devices, and lead to innovations in the treatment and diagnosis of diseases, improve the care given to patients, and empower citizens to participate in the decision-making process regarding their own health and well-being. However, the sensitive nature of health data prohibits healthcare organizations from sharing the data. The Personal Health Train (PHT) is a novel approach that aims to establish a distributed data analytics infrastructure enabling the (re)use of distributed healthcare data, while data owners stay in control of their own data. The main principle of the PHT is that data remain in their original location, and analytical tasks visit the data sources and execute there. The PHT provides a distributed, flexible approach to using data in a network of participants, incorporating the FAIR principles. It facilitates the responsible use of sensitive and/or personal data by adopting international principles and regulations. This paper presents the concepts and main components of the PHT and demonstrates how it complies with the FAIR principles.
Highlights
MOVING FROM CENTRALIZED DATA SHARING TO EMPOWERING DATA OWNERS TO GAIN CONTROL OVER DATA REUSE. Data-driven technologies are changing business, our daily lives, and the way we conduct research more than ever.
The Personal Health Train (PHT) is a novel approach establishing a FAIR distributed data analytics infrastructure enabling the use of distributed healthcare data, while data owners stay in control of their own data.
In summary, the PHT: (i) empowers citizens and organizations to control the use of the data that reside in their own data repositories for the benefit of the individual and society; (ii) improves the usability of health data by lowering the barriers for data protection, ensuring that the privacy and confidentiality of the data subject are preserved; (iii) ensures data sovereignty beyond data security and privacy by supporting responsible use, and builds trust between data consumers and data owners by making analytics processes repeatable, transparent, and auditable; (iv) applies the FAIR principles to the protocols by which data analytics interacts with FAIR data points, making the data analytics tasks themselves FAIR and placing machine readability at their core.
Summary
MOVING FROM CENTRALIZED DATA SHARING TO EMPOWERING DATA OWNERS TO GAIN CONTROL OVER DATA REUSE. Many healthcare institutions implement centralized repositories by pooling data from multiple systems into data warehouses or data lakes [10]. Sharing these data beyond the organization’s boundaries is not a viable solution, since anonymization may not be possible for certain data types such as genomic data, and since linking data sets increases the re-identification risk. Rather than moving the data to the requester, the PHT moves the analytics tasks to the data repositories and executes them in a secure environment. In this approach, the owner of the data remains in control and decides which part of the data will be analyzed, for which specific purposes, and by whom. We will demonstrate the application of the FAIR principles to the Personal Health Train approach.
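The idea of sending the analysis to the data, rather than the data to the analyst, can be illustrated with a minimal Python sketch. It is not the PHT implementation or its API: the `Station` and `Train` classes, their fields, and the example records are all hypothetical, chosen only to show a task visiting several data stations, each owner checking who is asking, for what purpose, and for which fields, and only aggregate results leaving a station.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Train:
    """Hypothetical analytics task that travels to the data."""
    requester: str
    purpose: str
    fields: set[str]
    task: Callable[[list[dict]], dict]


@dataclass
class Station:
    """Hypothetical data station: records stay here, the owner sets the policy."""
    name: str
    records: list[dict]
    allowed_fields: set[str]        # which part of the data may be analyzed
    allowed_purposes: set[str]      # for which purposes
    allowed_requesters: set[str]    # and by whom
    audit_log: list[str] = field(default_factory=list)

    def run(self, train: Train) -> Optional[dict]:
        # The owner's policy is checked before the task touches any data.
        permitted = (
            train.requester in self.allowed_requesters
            and train.purpose in self.allowed_purposes
            and train.fields <= self.allowed_fields
        )
        status = "ok" if permitted else "denied"
        self.audit_log.append(f"{train.requester}/{train.purpose}: {status}")
        if not permitted:
            return None
        # The task sees only the requested, permitted fields; it returns a
        # result, and the records themselves never leave the station.
        view = [{k: r[k] for k in train.fields} for r in self.records]
        return train.task(view)


def count_and_sum_age(rows: list[dict]) -> dict:
    """Example task: return aggregates only, never row-level data."""
    return {"n": len(rows), "age_sum": sum(r["age"] for r in rows)}


stations = [
    Station("hospital_a", [{"age": 50, "name": "x"}, {"age": 70, "name": "y"}],
            {"age"}, {"research"}, {"alice"}),
    Station("hospital_b", [{"age": 30, "name": "z"}],
            {"age"}, {"research"}, {"alice"}),
    # This owner has not consented to research use, so the train is refused.
    Station("clinic_c", [{"age": 90, "name": "w"}],
            {"age"}, {"billing"}, {"alice"}),
]

train = Train("alice", "research", {"age"}, count_and_sum_age)
results = [s.run(train) for s in stations]
partials = [r for r in results if r is not None]
mean_age = sum(p["age_sum"] for p in partials) / sum(p["n"] for p in partials)
print(mean_age)
```

In this sketch each station records every request in its own audit log, which loosely mirrors the goal of making analytics processes transparent and auditable. A real deployment would also need authentication, a sandboxed execution environment, and checks on what a task is allowed to return.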