Abstract

Many scientific endeavors, such as molecular biology, have become dependent on big data and its analysis. For example, precision medicine depends on molecular measurements and data analysis per patient. Data analyses supporting medical decisions must be standardized and performed consistently across patients. Data analyses in basic research, while perhaps not life-critical, have also become increasingly complex. RNAseq data, for example, require a multi-step analysis ranging from quality assessment of the measurements to statistical analyses. Workflow management systems (WFMS) enable the development of data analysis workflows (WF), their reproduction, and their application to datasets of the same type. However, far more than a hundred WFMS are available, and there is no way to convert data analysis WFs among WFMS. Therefore, the initial choice of a WFMS is important, as it entails a lock-in to the system. A WFMS's reach in its particular field (number of citations) can be used as a proxy for selecting a WFMS, but of the roughly 25 WFMS we mention in this work, at least five have a large reach in scientific data analysis. Hence, other criteria are needed to distinguish among WFMS. By extracting such criteria from selected studies concerning WFMS and adding further criteria, we arrived at five critical criteria: reproducibility, reusability, FAIRness, versioning support, and security. We deemed another five criteria (providing a graphical user interface, WF flexibility, WF scalability, WF shareability, and computational transparency) important but not critical for the assessment of WFMS. We applied the criteria to the most cited WFMS in PubMed and found none that supports all criteria. We hope that suggesting these criteria will spark a discussion on which features are important for WFMS in scientific data analysis and may lead to the development of WFMS that fulfill such criteria.

