Data Ecosystems for Scientific Experiments: Managing Combustion Experiments and Simulation Analyses in Chemical Engineering.

Edoardo Ramalli,Gabriele Scalia,Alessandro Stagni,Tiziano Faravelli,Barbara Pernici,Alberto Cuoci

doi:10.3389/fdata.2021.663410

Open Access

https://doi.org/10.3389/fdata.2021.663410

Journal: Frontiers in Big Data	Publication Date: Sep 15, 2021
Citations: 6	License type: CC BY 4.0

Affiliation: Politecnico di Milano

Abstract

The development of scientific predictive models has been of great interest over the decades. A scientific model can forecast domain outcomes without the need to perform expensive experiments. In combustion kinetics in particular, a model can help improve combustion facilities and fuel efficiency while reducing pollutants. At the same time, the amount of available scientific data has increased, helping to speed up the continuous cycle of model improvement and validation. This has also opened new opportunities to leverage large amounts of data for knowledge extraction. However, experiments are affected by several data quality problems, since they collect information from several decades of research, each characterized by different representation formats and sources of uncertainty. In this context, it is necessary to develop an automatic data ecosystem capable of integrating heterogeneous information sources while maintaining a quality repository. We present an innovative approach to data quality management from the chemical engineering domain, based on an available prototype of a scientific framework, SciExpeM, which has been significantly extended. We identified a new methodology from the model development research process that systematically extracts knowledge from the experimental data and the predictive model. In the paper, we show how our general framework can support the model development process and save precious research time, including in other experimental domains with similar characteristics, i.e., those that manage numerical data from experiments.

Highlights

One of the characteristics of Industry 4.0 is the availability of vast amounts of experimental data, which facilitates the development and refinement of predictive models.
It is not easy to estimate a priori the effort needed for this procedure; however, the pipeline that we have proposed is general enough to be applied to most experimental domains, ensuring data quality and predictive model improvement.
The development of scientific models has always been of great interest because they can be used to represent a domain.

Summary

Introduction

One of the characteristics of Industry 4.0 is the availability of vast amounts of experimental data, which facilitates the development and refinement of predictive models. The need emerged to systematically store and manage large quantities of experimental data collected and shared by various stakeholders. For a framework such as SciExpeM, the challenges include reconciling incompatible representations of data, assessing and guaranteeing agreed quality levels, and preserving property rights while sharing data. Such systems are being discussed in the Industry 4.0 domain. Data repurposing for analysis and the development of models with AI technologies require a mutual understanding of the data and their associated characteristics.
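
To make the idea of reconciling incompatible data representations under an agreed quality level more concrete, the following is a minimal sketch in Python. It is not SciExpeM's actual schema or API: the record fields (`T_C`, `P_bar`, `T_K`, `P_atm`), the `Measurement` class, and the quality rules are all hypothetical and chosen only for illustration. It assumes two sources that report temperature and pressure in different units, converts both to a common representation, and separates records that fail basic physical plausibility checks.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Measurement:
    """Hypothetical common representation: SI units plus provenance."""
    temperature_k: float  # temperature in kelvin
    pressure_pa: float    # pressure in pascal
    source: str           # label of the originating data source


def from_celsius_bar(record: Dict[str, float], source: str) -> Measurement:
    """Convert a record reported in degrees Celsius and bar (assumed format)."""
    return Measurement(record["T_C"] + 273.15, record["P_bar"] * 1e5, source)


def from_kelvin_atm(record: Dict[str, float], source: str) -> Measurement:
    """Convert a record reported in kelvin and atmospheres (assumed format)."""
    return Measurement(record["T_K"], record["P_atm"] * 101325.0, source)


def quality_issues(m: Measurement) -> List[str]:
    """Return a list of plausibility problems; an empty list means none found."""
    issues = []
    if m.temperature_k <= 0:
        issues.append("non-positive absolute temperature")
    if m.pressure_pa <= 0:
        issues.append("non-positive pressure")
    return issues


def integrate(
    measurements: List[Measurement],
) -> Tuple[List[Measurement], List[Measurement]]:
    """Split measurements into those that pass the checks and those that fail."""
    accepted, rejected = [], []
    for m in measurements:
        (rejected if quality_issues(m) else accepted).append(m)
    return accepted, rejected


if __name__ == "__main__":
    data = [
        from_celsius_bar({"T_C": 726.85, "P_bar": 1.0}, "lab_a"),
        from_kelvin_atm({"T_K": -5.0, "P_atm": 1.0}, "lab_b"),
    ]
    accepted, rejected = integrate(data)
    print(len(accepted), "accepted;", len(rejected), "rejected")
```

Keeping the source label on every record is one simple way to preserve provenance when data from several stakeholders are merged; a real repository would need far richer metadata and validation than this sketch shows.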

Methods

Results

Discussion

Conclusion