Reliable uncertainty quantification for remaining useful life (RUL) prediction is crucial for informed decision-making in predictive maintenance. In this context, we assess some of the latest developments in uncertainty quantification for deep learning prognostics: state-of-the-art variational inference algorithms for Bayesian neural networks (BNN), as well as popular alternatives such as Monte Carlo Dropout (MCD), deep ensembles (DE), and heteroscedastic neural networks (HNN). All inference techniques share the same Inception architecture as the functional model. We evaluate the methods on a subset of the large NASA N-CMAPSS dataset for aircraft engines. The assessment covers RUL prediction accuracy, the quality of the predictive uncertainty, and the ability to decompose the total predictive uncertainty into its aleatoric and epistemic parts. Although all methods are close in terms of accuracy, they differ in how they estimate uncertainty: DE and MCD generally provide more conservative predictive uncertainty than BNN. Surprisingly, HNN achieve strong results without the added complexity of BNN. None of the methods is strongly robust to out-of-distribution cases; BNN and HNN are particularly susceptible to low accuracy and overconfidence there, and the BNN techniques additionally exhibit anomalous miscalibration at the later stages of the system lifetime.
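For reference, the aleatoric/epistemic decomposition mentioned above conventionally follows the law of total variance over the model posterior p(θ|D); the notation below is the standard formulation and is illustrative rather than taken from the paper itself:

Var[y | x] = E_{θ~p(θ|D)}[ Var(y | x, θ) ] + Var_{θ~p(θ|D)}[ E(y | x, θ) ],

where the first term is the aleatoric component (irreducible data noise, e.g., the variance head of an HNN) and the second is the epistemic component (disagreement across posterior samples, dropout masks, or ensemble members).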