Target Detection on Hyperspectral Images Using MCMC and VI Trained Bayesian Neural Networks

Daniel Ries,Jason Adams,Joshua Zollweg

doi:10.1109/aero53065.2022.9843344

Abstract

Neural networks (NN) have become almost ubiquitous with image classification, but in their standard form produce point estimates, with no measure of confidence. Bayesian neural networks (BNN) provide uncertainty quantification (UQ) for NN predictions and estimates through the posterior distribution. As NN are applied in more high-consequence applications, UQ is becoming a requirement. Automating systems can save time and money, but only if the operator can trust what the system outputs. BNN provide a solution to this problem by not only giving accurate predictions and estimates, but also an interval that includes reasonable values within a desired probability. Despite their positive attributes, BNN are notoriously difficult and time consuming to train. Traditional Bayesian methods use Markov Chain Monte Carlo (MCMC), but this is often brushed aside as being too slow. The most common method is variational inference (VI) due to its fast computation, but there are multiple concerns with its efficacy. MCMC is the gold standard and given enough time, will produce the correct result. VI, alternatively, is an approximation that converges asymptotically. Unfortunately (or fortunately), high consequence problems often do not live in the land of asymtopia so solutions like MCMC are preferable to approximations. We apply and compare MCMC- and VI-trained BNN in the context of target detection in hyperspectral imagery (HSI), where materials of interest can be identified by their unique spectral signature. This is a challenging field, due to the numerous permuting effects practical collection of HSI has on measured spectra. Both models are trained using out-of-the-box tools on a high fidelity HSI target detection scene. Both MCMC- and VI-trained BNN perform well overall at target detection on a simulated HSI scene. Splitting the test set predictions into two classes, high confidence and low confidence predictions, presents a path to automation. For the MCMC-trained BNN, the high confidence predictions have a 0.95 probability of detection with a false alarm rate of 0.05 when considering pixels with target abundance of 0.2. VI-trained BNN have a 0.25 probability of detection for the same, but its performance on high confidence sets matched MCMC for abundances >0.4. However, the VI-trained BNN on this scene required significant expert tuning to get these results while MCMC worked immediately. On neither scene was MCMC prohibitively time consuming, as is often assumed, but the networks we used were relatively small. This paper provides an example of how to utilize the benefits of UQ, but also to increase awareness that different training methods can give different results for the same model. If sufficient computational resources are available, the best approach rather than the fastest or most efficient should be used, especially for high consequence problems.

Full Text