Abstract

Despite algorithmic advances in machine learning, the need for better infrastructure supporting machine learning development and research has become increasingly apparent. Machine learning experiments tend to be ad hoc in nature, and results are most often communicated in the form of a publication. Experimental details are frequently omitted due to space or time constraints, or simply because the technical setup or parametrization has grown intractably complex. Even when code bases are accessible, they disregard important properties of the environment and experimental setup, such as random number generators or the computing infrastructure. At the same time, tracking and communicating an inherently exploratory scientific process requires considerable effort. We explored different avenues to tackle these issues from a data science engineering point of view. These efforts resulted in PyPads, a framework providing the infrastructure to extend experimental setups with logging, communication, and analysis features in a mostly non-intrusive way. PyPads can be extended to different Python-based frameworks, utilizing community-driven, descriptive metadata in an effort to harmonize library-specific logs in an ontology. We also emphasize similarities to practices in software engineering, which have proven essential in practical applications.
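The "mostly non-intrusive" logging the abstract describes can be achieved in Python by wrapping (monkey-patching) library methods so that calls are recorded without changing user code. The sketch below is a generic illustration of that technique, not PyPads' actual API; the `Estimator` class and `log_calls` helper are hypothetical stand-ins for a framework class and an instrumentation hook.

```python
import functools

def log_calls(cls, method_name, log):
    """Hypothetical instrumentation hook: wrap a method on a class so every
    call is appended to `log`, without modifying the caller's code."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        result = original(self, *args, **kwargs)
        # Record descriptive metadata about the call, not the raw data itself.
        log.append({"method": method_name,
                    "n_args": len(args),
                    "kwargs": sorted(kwargs)})
        return result

    setattr(cls, method_name, wrapper)

class Estimator:
    """Stand-in for a class from an ML library (e.g. a scikit-learn-style estimator)."""
    def fit(self, X, y=None):
        self.fitted_ = True
        return self

log = []
log_calls(Estimator, "fit", log)           # instrument once, up front
Estimator().fit([[1.0], [2.0]], y=[0, 1])  # user code runs unchanged
print(log)  # → [{'method': 'fit', 'n_args': 1, 'kwargs': ['y']}]
```

In a real tracking framework, the wrapper would forward such records to a backend and enrich them with environment details (random seeds, hardware) rather than appending to an in-memory list.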
