Reproducibility is a major feature of science. Even agronomic research of exemplary quality may yield irreproducible empirical findings because of random or systematic error. Reproducing agronomic experiments that are based on statistical data and legacy scripts is not easily achieved. We propose RFlow, a tool that helps researchers manage, share, and enact scientific experiments that encapsulate legacy R scripts. RFlow transparently captures the provenance of scripts and endows experiments with reproducibility. Unlike existing computational approaches, RFlow is non-intrusive: it does not require users to change the way they work, and it wraps agronomic experiments in a scientific workflow system. Our computational experiments show that the tool can collect different types of provenance metadata from real experiments and enrich agronomic data with that metadata. This study shows the potential of RFlow to serve as the primary integration platform for legacy R scripts, with implications for other data- and compute-intensive agronomic projects.