Will it run? A proof of concept for smoke testing decentralized data analytics experiments

Sascha Welten, Sven Weber, Adrian Holt, Oya Beyan, Stefan Decker

doi:10.3389/fmed.2023.1305415

Abstract

The growing interest in data-driven medicine, in conjunction with the formation of initiatives such as the European Health Data Space (EHDS), has demonstrated the need for methodologies capable of facilitating privacy-preserving data analysis. Distributed Analytics (DA), as an enabler of privacy-preserving analysis across multiple data sources, has shown its potential to support data-intensive research. However, the application of DA creates new challenges stemming from its distributed nature, such as identifying single points of failure (SPOFs) in DA tasks before their actual execution. Failing to detect such SPOFs can, for example, result in improper termination of the DA code, requiring additional effort from multiple stakeholders to resolve the malfunctions. Moreover, these malfunctions disrupt the seamless conduct of DA and have several serious consequences, including technical obstacles to resolving the issues, potential delays in research outcomes, and increased costs. In this study, we address this challenge by introducing a concept based on a method called Smoke Testing, an initial and foundational test run to ensure the operability of the analysis code. We review existing DA platforms and systematically extract six specific Smoke Testing criteria for DA applications. With these criteria in mind, we create an interactive environment called Development Environment for AuTomated and Holistic Smoke Testing of Analysis-Runs (DEATHSTAR), which allows researchers to perform Smoke Tests on their DA experiments. We conduct a user study with 29 participants to assess our environment and additionally apply it to three real use cases. The results of our evaluation validate its effectiveness, revealing that 96.6% of the analyses created and (Smoke) tested by participants using our approach successfully terminated without any errors.
Thus, by incorporating Smoke Testing as a fundamental method, our approach helps identify potential malfunctions early in the development process, ensuring smoother data-driven research within the scope of DA. Through its flexibility and adaptability to diverse real use cases, our solution enables more robust and efficient development of DA experiments, which contributes to their reliability.
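
As an illustration of the general idea of a Smoke Test for DA code, the following minimal Python sketch runs an analysis function once against small synthetic datasets, one per simulated station, and records whether it terminates without error. It is not the DEATHSTAR implementation and does not encode the six criteria from the paper; the names `smoke_test`, `mean_age`, and `synthetic_stations`, as well as the sequential station-to-station execution model, are assumptions made for this example.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


def smoke_test(
    analysis: Callable[[List[Record], Any], Any],
    stations: Dict[str, List[Record]],
) -> Dict[str, str]:
    """Run `analysis` once per simulated station and report whether it terminates.

    The intermediate result (`state`) is handed from station to station,
    mimicking a sequential distributed run (an assumption of this sketch).
    """
    report: Dict[str, str] = {}
    state: Any = None
    for name, records in stations.items():
        try:
            state = analysis(records, state)
            report[name] = "ok"
        except Exception as exc:
            # Record the failure and stop: a real run would also abort here.
            report[name] = f"failed: {type(exc).__name__}: {exc}"
            break
    return report


def mean_age(records: List[Record], state: Any) -> Any:
    """Hypothetical analysis: accumulate a running (sum, count) of the 'age' field."""
    total, count = state or (0, 0)
    for record in records:
        total += record["age"]
        count += 1
    return total, count


# Synthetic stand-ins for the real, access-restricted station data.
synthetic_stations = {
    "station_a": [{"age": 54}, {"age": 61}],
    "station_b": [{"age": 47}],
    "station_c": [{"years": 70}],  # schema mismatch the smoke test should expose
}

report = smoke_test(mean_age, synthetic_stations)
for station, status in report.items():
    print(station, status)
```

In this sketch the schema mismatch at the third station surfaces as a `KeyError` during the test run, before any real data would be touched, which is the kind of early failure detection the abstract describes.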

Journal: Frontiers in Medicine
Publication Date: Jan 8, 2024
Citations: 1
License type: CC BY 4.0
