Abstract

As data analytics is used in business to increase profits, organizations use it to pursue their goals. Even if enterprise data could be already valuable on its own, in many cases, combining it with external data sources would boost the value of the output, making data sharing a need in data analytics. At the same time, organizations are reluctant to share data, as they are scared of disclosing critical information. This calls for solutions that are able to safeguard data holders by regulating how data can be shared to ensure the so-called data sovereignty. This paper focuses on the usage of data lakes as well-established technology across enterprises for data analytics where internal or publicly available data are considered. The goal is to extend data lakes with functionalities that, respecting the data sovereignty, enable a data lake also to be ingested with data shared by other organizations and to share data to external organizations. Notable, the purpose of this work is to face this issue by defining an architecture that, inserted in a federated environment: restricts data access and enables monitoring that the actual usage of data respects the data sovereignty expressed in the policies agreed upon by the involved parties; makes use of Blockchain technology as a means for guaranteeing the traceability of data sharing; and allows for balancing computation movement and data movement. The proposed approach has been applied to a healthcare scenario where several institutions (e.g., hospitals and clinics, research institutes, and medical universities) produce and collect clinical data in local data lakes.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.