Version [formula omitted]- [SAMbA-RaP is music to scientists’ ears: Adding provenance support to spark-based scientific workflows]

Thaylon Guedes, Marta Mattoso, Marcos Bedo, Daniel De Oliveira

doi:10.1016/j.softx.2024.101927

Journal: SoftwareX

Publication Date: Oct 17, 2024

#In-memory Data Structures #Black-box Applications

Abstract

Researchers use Apache Spark to execute scientific workflows at scale, but the framework’s design limitations often leave them without provenance support. This paper presents SAMbA-RaP, a provenance extension for Apache Spark. It focuses on: (i) executing external, black-box applications with intensive I/O operations within the workflow while leveraging Spark’s in-memory data structures; (ii) extracting domain-specific data from in-memory data structures; and (iii) implementing data versioning and capturing the provenance graph of a workflow execution. SAMbA-RaP also provides real-time reports via a web interface, enabling scientists to explore dataflow transformations and content evolution as their workflows run.
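To make the idea of data versioning and provenance-graph capture concrete, the following is a minimal sketch in plain Python. It is not SAMbA-RaP's API and does not use Spark: ordinary lists stand in for in-memory datasets, and the names `ProvenanceGraph`, `version_id`, `apply`, and `lineage` are hypothetical, introduced here only for illustration. The content-hash versioning scheme is likewise an assumption, not something the abstract specifies.

```python
import hashlib
from dataclasses import dataclass, field


def version_id(value) -> str:
    """Derive a short version identifier from a value's content (assumed scheme)."""
    return hashlib.sha256(repr(value).encode("utf-8")).hexdigest()[:12]


@dataclass
class ProvenanceGraph:
    """Hypothetical recorder: nodes are data versions, edges are transformations."""
    nodes: dict = field(default_factory=dict)   # version id -> value
    edges: list = field(default_factory=list)   # (input id, step name, output id)

    def register(self, value) -> str:
        vid = version_id(value)
        self.nodes[vid] = value
        return vid

    def apply(self, name, func, values):
        """Apply func to each element, recording one edge per element."""
        outputs = []
        for value in values:
            src = self.register(value)
            result = func(value)
            dst = self.register(result)
            self.edges.append((src, name, dst))
            outputs.append(result)
        return outputs

    def lineage(self, value):
        """Return the step names that led to value, oldest first.

        Steps that left an element unchanged are skipped; if several edges
        produce the same version, the last one recorded is followed.
        """
        producers = {dst: (src, name) for src, name, dst in self.edges if src != dst}
        target, steps, seen = version_id(value), [], set()
        while target in producers and target not in seen:
            seen.add(target)
            target, name = producers[target]
            steps.append(name)
        return steps[::-1]


# Usage: two element-wise transformations over a tiny stand-in dataset.
graph = ProvenanceGraph()
raw = [" acgt ", "ttga"]
cleaned = graph.apply("strip", str.strip, raw)
upper = graph.apply("upper", str.upper, cleaned)
history = graph.lineage("ACGT")
```

Here `history` lists the transformations that produced the element `"ACGT"`, which is the kind of question a provenance graph is meant to answer. A real extension would capture this inside the framework's execution rather than through an explicit wrapper.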
