Workflows in AiiDA: Engineering a high-throughput, event-based engine for robust and modular computational workflows

Martin Uhrin, Sebastiaan P. Huber, Jusong Yu, Nicola Marzari, Giovanni Pizzi

doi:10.1016/j.commatsci.2020.110086

Abstract

Over the last two decades, the field of computational science has seen a dramatic shift towards incorporating high-throughput computation and big-data analysis as fundamental pillars of the scientific discovery process. This has necessitated the development of tools and techniques to deal with the generation, storage and processing of large amounts of data. In this work we present an in-depth look at the workflow engine powering AiiDA, a widely adopted, highly flexible and database-backed informatics infrastructure with an emphasis on data reproducibility. We detail many of the design choices that were made which were informed by several important goals: the ability to scale from running on individual laptops up to high-performance supercomputers, managing jobs with runtimes spanning from fractions of a second to weeks and scaling up to thousands of jobs concurrently, and all this while maximising robustness. In short, AiiDA aims to be a Swiss army knife for high-throughput computational science. As well as the architecture, we outline important API design choices made to give workflow writers a great deal of liberty whilst guiding them towards writing robust and modular workflows, ultimately enabling them to encode their scientific knowledge to the benefit of the wider scientific community.

Highlights

Computational power has increased steadily and dramatically over the past few decades, and the field of computational science has grown with it.
We detail many of the design choices that were made, which were informed by several important goals: the ability to scale from running on individual laptops up to high-performance supercomputers, managing jobs with runtimes spanning from fractions of a second to weeks and scaling up to thousands of jobs concurrently, and all this while maximising robustness.
As well as the architecture, we outline important application programming interface (API) design choices made to give workflow writers a great deal of liberty whilst guiding them towards writing robust and modular workflows, enabling them to encode their scientific knowledge to the benefit of the wider scientific community.

Summary

INTRODUCTION

Computational power has increased steadily and dramatically over the past few decades, and the field of computational science has grown with it. In some workflow systems, a workflow can insert new steps or spawn additional logical branches while it is running, based on intermediate results produced by previously completed steps. While this enables runtime-mutable workflows, the specific mutations are bound by the constraints of the custom static JSON markup language through which they are defined. Workflows in AiiDA are implemented directly in Python and as such have all the dynamic expressiveness of a programming language at their disposal, as well as full access to the entire provenance graph and the data already stored in the database. This proves to be a very powerful mechanism for dealing with, for example, error handling when running high-throughput simulations. We first describe the user interface, followed by a technical description of the architecture and implementation of the engine.
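The point about error handling can be illustrated with a minimal sketch in plain Python. It does not use AiiDA or its API: `StepResult`, `fake_calculation`, `run_with_error_handling` and the exit code 410 are all hypothetical names and values chosen for illustration. It only shows how a workflow written in a general-purpose language can inspect an intermediate result and decide its next step at run time.

```python
from dataclasses import dataclass, field


@dataclass
class StepResult:
    """Outcome of one simulated calculation (hypothetical structure)."""
    exit_code: int  # 0 means success, anything else is a failure
    outputs: dict = field(default_factory=dict)


def fake_calculation(inputs: dict) -> StepResult:
    """Stand-in for an expensive simulation (assumption, not a real code).

    It fails with a hypothetical "not converged" code when too few
    iterations are allowed.
    """
    if inputs["max_iterations"] < 100:
        return StepResult(exit_code=410)
    return StepResult(exit_code=0, outputs={"energy": -1.0})


def run_with_error_handling(inputs: dict, max_attempts: int = 5):
    """Re-run the calculation, adapting the inputs after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    history = []
    for _ in range(max_attempts):
        result = fake_calculation(inputs)
        history.append((dict(inputs), result.exit_code))
        if result.exit_code == 0:
            break
        if result.exit_code == 410:
            # Ordinary Python logic chooses the next step at run time:
            # here, double the iteration limit and try again.
            inputs = {**inputs, "max_iterations": inputs["max_iterations"] * 2}
        else:
            break  # unknown failure: stop rather than guess
    return result, history


result, history = run_with_error_handling({"max_iterations": 30})
print(result.exit_code, len(history))
```

Here the retry policy is just an `if` statement, so it can be made as elaborate as the scientific problem requires without being limited by a static workflow description format.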

USER INTERFACE

Process specification

Ports and port namespaces

Inputs and outputs

Exit codes

Work functions

Work chains

Calculation jobs

ARCHITECTURE

The engine

Vertical scaling

The process

Persistence

Communication

CONCLUSIONS

Journal: Computational Materials Science
Publication Date: Nov 16, 2020
Citations: 82
License type: cc-by
