Abstract

Machine/deep learning (DL) prediction of the physicochemical properties of sustainable aviation fuel (SAF) from chemical data offers a rapid way to prescreen the viability of new SAF candidates but is limited by uncertainties. In this article, the uncertainties arising from insufficient training data (epistemic) and finite-resolution chemical features (heteroscedastic) are addressed through a deep uncertainty quantification (UQ) study that uses a Bayesian neural network ensemble (BNNE) to model and analyze them. In particular, flash point is predicted from two-dimensional gas chromatography (GC×GC) features in various scenarios that differ in epistemicity and heteroscedasticity. Several insights are obtained: (1) Overparameterization of the network provides a buffer against epistemicity and should be advocated in the absence of sufficient data. (2) Reducing the epistemic uncertainty via GC×GC localization does not always improve accuracy, highlighting the necessity of a probabilistic formulation to prevent overconfident but erroneous predictions. (3) Heteroscedastic uncertainty is larger and irreducible for lower-resolution features, e.g., GC separated by chemical family but not by molecular formula. These findings aim not only to facilitate trustworthy DL practices in SAF modeling but also to emphasize the importance of establishing a big data pipeline and designing finer features (e.g., isomer differentiation via vacuum ultraviolet spectroscopy) to mitigate these uncertainties.
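
The distinction between epistemic and heteroscedastic uncertainty can be illustrated with a generic ensemble decomposition based on the law of total variance. The following is a minimal sketch, not the article's BNNE implementation: it assumes each ensemble member returns a predictive mean and a predictive variance for one fuel sample, and the function name `decompose_uncertainty` and the flash-point numbers are hypothetical.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(member_means, member_variances):
    """Split an ensemble's predictive variance for a single sample.

    member_means:     predictive mean from each ensemble member
    member_variances: predictive (noise) variance from each member
    Both inputs are assumptions of this sketch, not the article's interface.
    """
    # Ensemble prediction: average of the member means.
    prediction = fmean(member_means)
    # Epistemic part: disagreement between members (variance of the means).
    epistemic = pvariance(member_means)
    # Heteroscedastic part: average noise variance predicted by the members.
    heteroscedastic = fmean(member_variances)
    # Law of total variance: the two parts add up to the total variance.
    total = epistemic + heteroscedastic
    return prediction, epistemic, heteroscedastic, total


# Hypothetical flash-point outputs (in degrees C) from three ensemble members.
member_means = [40.0, 42.0, 44.0]
member_variances = [1.0, 1.0, 1.0]

prediction, epistemic, heteroscedastic, total = decompose_uncertainty(
    member_means, member_variances
)
print(prediction, epistemic, heteroscedastic, total)
```

In this toy case the members disagree on the mean, so most of the total variance is epistemic; more training data would be expected to shrink that term, whereas the heteroscedastic term reflects noise tied to the input features.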
